Repository: MensaarLecker
Motivation
Previous Post: MensaarLecker Development Log 1 – Web Crawling
My friends and I hate-love the UdS Mensa so much! The endless frozen food and french fries menus give us so much energy and motivation for the 5-hour afternoon coding marathon. However, no one actually knows how many potatoes they have exterminated throughout the week. There is already a genius webpage created by some Schnitzel lover, and personally I like its minimalistic layout and its single-minded determination to hunt down Schnitzel.
However, we want more.
It’s not just Schnitzel; we want to know everything about their menu. We want to know what’s inside the Mensa ladies’ brains when they design next week’s menu.
The desire never ends. We need more data, more details, more, More, MORE!
Fetching Data for the Web Interface
Continuing from the last post, we have already implemented a script that collects the Mensa menu and stores it in a Google Sheet. Now it is time to build the web interface that connects to this database.
First Attempt – Fetch Data from Google Sheets via Publish to the Web
First, we need to publish our spreadsheet so that the data is publicly accessible:
1. In the spreadsheet, click Share → change access to Anyone with the link.
2. Click File → Share → Publish to the web.
3. Select Entire Document → Comma-separated values (.csv) and click Publish.
4. Copy the public CSV link.
import requests

SCRIPT_URL = "PUBLISH_LINK"  # the public CSV link copied above

# Fetch JSON data
def fetch_menu():
    try:
        response = requests.get(SCRIPT_URL)
        response.raise_for_status()  # Raise an error on a bad response
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"❌ Error fetching menu: {e}")
        return []
However, the script returns no data. Why?
Access to fetch at 'https://docs.google.com/spreadsheets/...' from origin 'null' has been blocked
by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
CORS Policy and XSS
Cross-origin resource sharing (CORS) is an extension of the same-origin policy. You need it for authorized resource sharing with external third parties. – Amazon Web Services
This is a security scheme designed to prevent attacks such as XSS (cross-site scripting). In a nutshell, when we run the script, the request cannot proceed because it is not logged in to any Google account. You can imagine the request being blocked by an imaginary login page that our program does not know how to react to.
Second Attempt – Google Apps Script
Google doesn’t allow users to fetch this data casually unless the request is executed on Google’s own servers. This means we need to run our fetching function through a Google service. Apps Script provides a JavaScript editor where we can save and run such code.
function doGet() {
  var sheet = SpreadsheetApp.openById("PUT_YOUR_SHEET_ID_HERE").getActiveSheet();
  var data = sheet.getDataRange().getValues();
  var headers = data[0];
  var jsonData = [];

  // Convert each row into an object keyed by the header row
  for (var i = 1; i < data.length; i++) {
    var row = {};
    for (var j = 0; j < headers.length; j++) {
      row[headers[j]] = data[i][j];
    }
    jsonData.push(row);
  }

  var output = ContentService.createTextOutput(JSON.stringify(jsonData));
  output.setMimeType(ContentService.MimeType.JSON);
  return output;
}
To get the sheet ID, simply open the sheet in the browser: the ID is the long string between /d/ and /edit in the URL, i.e. https://docs.google.com/spreadsheets/d/SHEET_ID/edit
Deploying the function and fetching the data
Afterwards, we can deploy this function as a web app (with access set to “Anyone”), and Apps Script generates a unique URL for its output. We can then fetch the data from that URL (here exported in JSON format) and use it in our code.
You can also double-check the URL in the browser and make sure it returns the values correctly → link
import requests

SCRIPT_URL = "URL_DEPLOYED_FROM_APP_SCRIPT"

# Fetch JSON data from the deployed Apps Script web app
def fetch_menu():
    try:
        response = requests.get(SCRIPT_URL)
        response.raise_for_status()  # Raise an error on a bad response
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"❌ Error fetching menu: {e}")
        return []
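For a quick sanity check, we can call the function and peek at the first rows it returns (the exact keys depend on the header row of your sheet):

```python
menu = fetch_menu()
print(f"Fetched {len(menu)} rows")
print(menu[:2])  # each row is a dict keyed by the sheet's header row
```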
Webpage Implementation
To simplify our automation process on GitHub, we will keep generating our HTML with Python. Our website has two pages: index.html contains today’s menu, and menu.html contains the history of collected menus.
Static data – Triple Quotes
We can put all the static HTML in strings. In Python, multi-line strings can be written with triple quotes.
html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Mensaar Today</title>
<style>
body {{ font-family: Arial, sans-serif; padding: 20px; text-align: center; background-image: url('src/uds_spirit.jpg'); }}
h1 {{
background: rgba(255, 255, 255, 0.8);
color: #003C71;
padding: 10px 20px;
display: inline-block;
border-radius: 10px;
box-shadow: 2px 2px 10px rgba(0, 0, 0, 0.2);
}}
.container {{ width: 80%; margin: auto; }}
.closed-message {{ font-size: 26px; color: red; font-weight: bold; padding: 20px; background: #fff3f3; border-radius: 10px; }}
.menu-card {{ background: white; padding: 15px; margin: 10px 0; border-radius: 10px; box-shadow: 2px 2px 10px rgba(0,0,0,0.1); text-align: left; }}
.meal-title {{ font-size: 20px; font-weight: bold; }}
.meal-components {{ font-size: 16px; color: #666; }}
.meal-frequency {{ font-size: 14px; color: #888; font-style: italic; }}
.button {{ padding: 12px 20px; background: #007bff; color: white; border-radius: 5px; text-decoration: none; }}
</style>
</head>
<body>
<h1>Mensaar Menu for {today}</h1><br>
<a href="menu.html" class="button">📜 View Full Menu</a>
<div class="container">
"""
Enhance the design using DataTables
Since we collect the menu every day, the table in menu.html will eventually become too long to load and hard to browse. We can use DataTables, which provides table features such as filtering, searching, sorting, and pagination out of the box. It is also very easy to adopt: simply include its JavaScript and CSS links in the HTML and you get a basic but decent design, as sketched below.
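Since we generate the page from Python anyway, the includes and the table skeleton can go straight into the menu.html template string. A minimal sketch (the variable name, CDN versions, and the table id menuTable are only examples; DataTables also needs jQuery):

```python
menu_html = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Mensaar Menu History</title>
    <!-- DataTables CSS/JS from the CDN; versions here are only examples -->
    <link rel="stylesheet" href="https://cdn.datatables.net/1.13.6/css/jquery.dataTables.min.css">
    <script src="https://code.jquery.com/jquery-3.7.1.min.js"></script>
    <script src="https://cdn.datatables.net/1.13.6/js/jquery.dataTables.min.js"></script>
</head>
<body>
    <table id="menuTable">
        <thead><tr><th>Date</th><th>Meal</th><th>Components</th></tr></thead>
        <tbody>
            <!-- rows generated from the Google Sheets data go here -->
        </tbody>
    </table>
    <script>
        // Turn the plain table into a sortable, searchable DataTable
        $(document).ready(function () {
            $('#menuTable').DataTable();
        });
    </script>
</body>
</html>"""
```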
Automation with GitHub Workflow
Finally, after we deploy the code to GitHub, remember our original goals:
1. Scrape the Mensa menu every weekday and store it in Google Sheets
2. Fetch the data collection from Google Sheets and update the website
In fact, we can run the Python scripts periodically using a GitHub workflow. Here are the steps:
The workflow has to be written in .yml format and stored in .github/workflows/{workflow_name}.yml
Before running the Python scripts, make sure Python is set up with all the dependencies installed:
- name: 🛠 Set up Python
  uses: actions/setup-python@v4
  with:
    python-version: "3.x"

- name: 📦 Install dependencies
  run: |
    pip install requests selenium webdriver-manager gspread oauth2client

- name: 🚀 Run Mensaar Scraper (update Google Sheets)
  run: |
    echo "🧪 Starting Mensaar_scraper..."
    python Mensaar_scraper.py
    echo "✅ Scraper completed."

- name: 🖼️ Run HTML Generator
  run: |
    echo "🧪 Generating index.html & menu.html"
    python generate_menu.py
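The steps above only describe what to run; the periodic part comes from a schedule trigger at the top of the workflow. A minimal sketch of that surrounding skeleton (the cron time is an assumption, adjust it to when the Mensa publishes its menu; GitHub Actions cron runs in UTC):

```yaml
name: Mensaar Scraper

on:
  schedule:
    - cron: "0 9 * * 1-5"   # every weekday at 09:00 UTC (assumed time)
  workflow_dispatch:         # also allow manual runs from the Actions tab

jobs:
  update-menu:
    runs-on: ubuntu-latest
    steps:
      - name: 📥 Check out the repository
        uses: actions/checkout@v4
      # ... followed by the Python setup, scraper, and HTML generator steps shown above
```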
For the full workflow, you can find the skeleton code template on GitHub or you can check here.
Please cite the source for reprints. Feel free to verify the sources cited in the article and point out any errors or unclear wording. You can comment in the comments section below or email GreenMeeple@yahoo.com