mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-07 02:03:55 -05:00
refactor(scripts): standardize script naming convention for clarity
@@ -54,7 +54,7 @@ repos:
    # --- Content Formatting ---
    - id: collapse-extra-blank-lines
      name: "Collapse extra blank lines"
-     entry: python tools/scripts/content/collapse_blank_lines.py
+     entry: python tools/scripts/content/format_blank_lines.py
      language: python
      pass_filenames: true
      files: ^quarto/contents/.*\.qmd$
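The renamed formatting script itself is not shown in this diff. As a rough illustration only, a blank-line collapsing hook can be as small as the following sketch (hypothetical; not the project's actual `format_blank_lines.py`):

```python
import re

# Hypothetical sketch of a blank-line collapsing pass (illustrative only;
# the real tools/scripts/content/format_blank_lines.py is not shown here).
def collapse_blank_lines(text: str) -> str:
    # Replace runs of two or more consecutive blank lines with a single one.
    return re.sub(r'\n{3,}', '\n\n', text)

print(collapse_blank_lines("para one\n\n\n\npara two"))
```

A pre-commit `entry` like the one above would call such a script once per matching `.qmd` file, since `pass_filenames: true` hands the staged paths to it.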
@@ -127,7 +127,7 @@ repos:
    # --- Structural & Reference Validation ---
    - id: check-unreferenced-labels
      name: "Check for unreferenced labels"
-     entry: python ./tools/scripts/content/find_unreferenced_labels.py ./quarto/contents/core
+     entry: python ./tools/scripts/content/check_unreferenced_labels.py ./quarto/contents/core
      language: python
      additional_dependencies: []
      pass_filenames: false
@@ -135,7 +135,7 @@ repos:

    - id: check-duplicate-labels
      name: "Check for duplicate labels"
-     entry: python3 tools/scripts/content/find_duplicate_labels.py
+     entry: python3 tools/scripts/content/check_duplicate_labels.py
      args: ['-d', 'quarto/contents/', '--figures', '--tables', '--listings', '--quiet', '--strict']
      language: system
      pass_filenames: false
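The renamed `check_duplicate_labels.py` is likewise not part of this diff. Conceptually, a duplicate-label check over Quarto sources reduces to counting cross-reference labels; here is a hypothetical minimal version (the regex and function name are illustrative, not the project's actual script):

```python
import re
from collections import Counter

# Hypothetical sketch of a duplicate-label check (the real
# tools/scripts/content/check_duplicate_labels.py is not shown in this diff).
# Quarto labels look like {#fig-...}, {#tbl-...}, {#lst-...}, {#sec-...}.
LABEL_RE = re.compile(r'\{#((?:fig|tbl|lst|sec)-[\w-]+)')

def find_duplicate_labels(qmd_text: str) -> list:
    counts = Counter(LABEL_RE.findall(qmd_text))
    return sorted(label for label, n in counts.items() if n > 1)

sample = """
{#fig-labels width=90%}
{#fig-labels width=90%}
{#tbl-results}
"""
print(find_duplicate_labels(sample))  # -> ['fig-labels']
```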
@@ -175,7 +175,7 @@ repos:
    # --- Image Validation ---
    - id: validate-images
      name: "Validate image files"
-     entry: python tools/scripts/utilities/check_images.py
+     entry: python tools/scripts/utilities/manage_images.py
      language: python
      additional_dependencies:
        - pillow
@@ -185,7 +185,7 @@ repos:

    - id: validate-external-images
      name: "Check for external images in Quarto files"
-     entry: python3 tools/scripts/download_external_images.py --validate quarto/contents/
+     entry: python3 tools/scripts/manage_external_images.py --validate quarto/contents/
      language: system
      pass_filenames: false
      files: ^quarto/contents/.*\.qmd$
@@ -1 +1 @@
-config/_quarto-html.yml
+config/_quarto-pdf.yml
@@ -1353,7 +1353,7 @@ To build effective machine learning systems, we must first understand how differ

@fig-labels illustrates the common label types:

{#fig-labels width=90%}

The choice of label format depends heavily on our system requirements and resource constraints [@10.1109/ICRA.2017.7989092]. While classification labels might suffice for simple traffic counting, autonomous vehicles need detailed segmentation maps to make precise navigation decisions. Leading autonomous vehicle companies often maintain hybrid systems that store multiple label types for the same data, allowing flexible use across different applications.
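The hybrid-storage idea described above can be pictured as a single record carrying several label types side by side. The following is a hypothetical sketch (field names are illustrative, not taken from the book or any company's schema):

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical sketch of a "hybrid" label record: one sample keeps
# multiple label types so different applications can pick what they need.
@dataclass
class HybridLabel:
    image_id: str
    class_label: Optional[str] = None          # classification label
    boxes: List[Tuple[float, float, float, float]] = field(default_factory=list)  # detection boxes
    mask_path: Optional[str] = None            # segmentation map stored on disk

rec = HybridLabel("frame_0001", class_label="car",
                  boxes=[(0.1, 0.2, 0.4, 0.6)],
                  mask_path="masks/frame_0001.png")
print(rec.class_label, len(rec.boxes))  # -> car 1
```

A traffic-counting job could read only `class_label`, while a navigation stack would load the `mask_path` for the same frame.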
@@ -644,15 +644,15 @@ A useful example of this attack technique can be seen in a power analysis of a p

@fig-encryption shows the device's behavior when the correct password is entered. The red waveform captures the serial data stream, marking each byte as it is received. The blue curve records the device's power consumption over time. When the full, correct password is supplied, the power profile remains stable and consistent across all five bytes, providing a clear baseline for comparison with failed attempts.

{#fig-encryption}

When an incorrect password is entered, the power analysis chart changes as shown in @fig-encryption2. In this case, the first three bytes (`0x61, 0x52, 0x77`) are correct, so the power patterns closely match the correct password up to that point. However, when the fourth byte (`0x42`) is processed and found to be incorrect, the device halts authentication. This change is reflected in the sudden jump in the blue power line, indicating that the device has stopped processing and entered an error state.

{#fig-encryption2}

@fig-encryption3 shows the case where the password is entirely incorrect (`0x30, 0x30, 0x30, 0x30, 0x30`). Here, the device detects the mismatch immediately after the first byte and halts processing much earlier. This is again visible in the power profile, where the blue line exhibits a sharp jump following the first byte, reflecting the device's early termination of authentication.

{#fig-encryption3}

These examples demonstrate how attackers can exploit observable power consumption differences to reduce the search space and eventually recover secret data through brute-force analysis. By systematically measuring power consumption patterns and correlating them with different inputs, attackers can extract sensitive information that should remain hidden.
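The behavior described above, where authentication halts at the first mismatching byte, is exactly what an early-exit comparison produces. A minimal sketch (illustrative Python, not the device's actual firmware) contrasts it with a constant-time check that removes the data-dependent early exit:

```python
# Early-exit comparison: the loop does more work (and draws power for
# longer) for each correct prefix byte, leaking how many bytes matched.
def check_password_leaky(entered: bytes, secret: bytes) -> bool:
    for a, b in zip(entered, secret):
        if a != b:
            return False  # halts early: observable in the power trace
    return len(entered) == len(secret)

# Constant-time comparison: every byte is processed regardless of
# mismatches, so the work done is independent of the matching prefix.
def check_password_constant_time(entered: bytes, secret: bytes) -> bool:
    if len(entered) != len(secret):
        return False
    diff = 0
    for a, b in zip(entered, secret):
        diff |= a ^ b  # accumulate differences without branching
    return diff == 0
```

With the first version, an attacker who can watch the power trace recovers the password one byte at a time; with the second, all wrong guesses look alike.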
@@ -16,11 +16,11 @@ We are grateful for the academic support that has made it possible to hire teach

::: {layout-nrow=2}

:::
@@ -30,7 +30,7 @@ We gratefully acknowledge the support of the following non-profit organizations

::: {layout-nrow=1}

@@ -42,7 +42,7 @@ The following companies contributed hardware kits used for the labs in this book

::: {layout-nrow=1}

@@ -43,7 +43,7 @@ In this KWS project, we will focus on Stage 1 (KWS or Keyword Spotting), where w
The diagram below gives an idea of how the final KWS application should work (during inference):

\noindent

Our KWS application will recognize four classes of sound:
@@ -59,7 +59,7 @@ Our KWS application will recognize four classes of sound:
The main component of the KWS application is its model. So, we must train such a model with our specific keywords, noise, and other words (the "unknown"):

\noindent

## Dataset {#sec-keyword-spotting-kws-dataset-7279}
@@ -168,12 +168,12 @@ The following step is to create the features to be trained in the next phase:
We could keep the default parameter values, but we will use the DSP `Autotune parameters` option.

\noindent

We will take the `Raw features` (our 1-second, 16 kHz sampled audio data) and use the MFCC processing block to calculate the `Processed features`. For every 16,000 raw features (16,000 $\times$ 1 second), we will get 637 processed features $(13\times 49)$.

\noindent
{width="90%" fig-align="center"}

The result shows that we only used a small amount of memory to pre-process data (16 KB) and a latency of 34 ms, which is excellent. For example, on an Arduino Nano (Cortex-M4f \@ 64 MHz), the same pre-process will take around 480 ms. The parameters chosen, such as the `FFT length` \[512\], will significantly impact the latency.
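The 637-feature figure quoted above can be checked with simple framing arithmetic. The window and stride below are assumptions (25 ms window, 20 ms stride, one common MFCC configuration), chosen because they reproduce the 49 frames; the exact values used by the DSP block may differ:

```python
# Back-of-the-envelope check of the MFCC output shape (13 x 49 = 637).
# Window/stride values are assumptions, not taken from the DSP block's config.
SAMPLE_RATE = 16_000                      # 16 kHz audio
N_SAMPLES = SAMPLE_RATE * 1               # 1 second of raw features
FRAME_LEN = int(0.025 * SAMPLE_RATE)      # 400 samples per window (25 ms)
FRAME_STRIDE = int(0.020 * SAMPLE_RATE)   # 320 samples between windows (20 ms)
N_COEFFS = 13                             # MFCC coefficients per frame

n_frames = 1 + (N_SAMPLES - FRAME_LEN) // FRAME_STRIDE
print(n_frames)               # 49 frames
print(n_frames * N_COEFFS)    # 637 processed features
```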
@@ -282,7 +282,7 @@ So, for an FFT length of 32 points, the resulting output of the Spectral Analysi
Once we understand what the pre-processing does, it is time to finish the job. So, let's take the raw data (time-series type) and convert it to tabular data. For that, go to the `Spectral Features` section on the `Parameters` tab, define the main parameters as discussed in the previous section (`[FFT]` with `[32]` points), and select `[Save Parameters]`:

\noindent
{width="85%" fig-align="center"}

At the top menu, select the `Generate Features` option and the `Generate Features` button. Each 2-second data window will be converted into one data point of 63 features.
@@ -6,7 +6,7 @@ The selected platforms are widely used in commercial applications, thereby ensur

## Our Featured Platform

{fig-align="center"}

The [XIAOML Kit](https://www.seeedstudio.com/blog/2025/08/05/introducing-the-xiaoml-kit-your-tinyml-journey-starts-here/) is the most recent addition to our educational hardware platforms (released on July 31st, 2025). It offers a comprehensive TinyML development environment for learning about ML systems, featuring integrated wireless connectivity, a camera, multiple sensors, and extensive documentation. This compact board exemplifies how contemporary embedded systems can efficiently provide advanced machine learning capabilities within a cost-effective framework.
@@ -118,7 +118,7 @@ The XIAOML Kit excels at wireless connectivity and cost-sensitive deployments. I

The XIAO ESP32S3 represents the category of ultra-compact, wireless-enabled microcontrollers optimized for IoT applications. The name "XIAO" (小) translates to "tiny" in Chinese, reflecting the board's 21×17.5mm form factor.

{width=400}

**Processor Architecture:**
ESP32-S3 dual-core Xtensa LX7 running at 240MHz
@@ -278,7 +278,7 @@ Close the Upload Data window and return to the **Data acquisition** page. We can
Classifying images is the most common application of deep learning, but a substantial amount of data is required to accomplish this task. We have around 50 images for each category. Is this number enough? Not at all! We will need thousands of images to "teach" or "model" each class, allowing us to differentiate them. However, we can resolve this issue by retraining a previously trained model using thousands of images. We refer to this technique as **"Transfer Learning" (TL)**. With TL, we can fine-tune a pre-trained image classification model on our data, achieving good performance even with relatively small image datasets, as in our case.

\noindent
{width=85% fig-align="center"}

With TL, we can fine-tune a pre-trained image classification model on our data, performing well even with relatively small image datasets (our case).
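The transfer-learning idea above can be sketched in a few lines: freeze a pre-trained feature extractor and train only a small classifier head on the small dataset. This is a conceptual NumPy illustration (a fixed random projection stands in for the pre-trained backbone; it is not the Edge Impulse training pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen, pre-trained backbone: a fixed projection + ReLU
# mapping raw inputs (e.g. flattened pixels) to an 8-dimensional feature vector.
W_frozen = rng.normal(size=(64, 8)) / 8.0

def backbone(x):
    return np.maximum(x @ W_frozen, 0.0)  # weights are never updated

# Tiny labeled dataset, mimicking the ~50-images-per-class situation.
X = rng.normal(size=(100, 64))
y = (X[:, 0] > 0).astype(float)

F = backbone(X)  # extract features once; only the head sees gradients

def loss(w, b):
    p = 1 / (1 + np.exp(-(F @ w + b)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

# Train only the head (logistic regression) by gradient descent.
w, b = np.zeros(8), 0.0
initial = loss(w, b)
for _ in range(300):
    p = 1 / (1 + np.exp(-(F @ w + b)))
    w -= 0.1 * F.T @ (p - y) / len(y)
    b -= 0.1 * np.mean(p - y)
final = loss(w, b)
print(f"head loss: {initial:.3f} -> {final:.3f}")
```

Under these assumptions the head's loss drops well below its starting value; the point is that when the frozen features already carry signal, only a handful of parameters need training, so a small dataset suffices.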
@@ -1,368 +0,0 @@
#!/usr/bin/env python3
"""
GitHub Workflow Runs Cleanup Script

This script helps clean up old GitHub workflow runs while keeping a configurable
number of recent runs per workflow. Useful for cleaning up failed debugging runs.

Usage:
    python cleanup_workflow_runs.py --help
    python cleanup_workflow_runs.py --dry-run
    python cleanup_workflow_runs.py --keep 5 --token YOUR_TOKEN
    python cleanup_workflow_runs.py --keep 10 --workflow "quarto-build.yml"

Requirements:
    - GitHub personal access token with 'actions:write' scope
    - Set token via --token flag or GITHUB_TOKEN environment variable
"""

import argparse
import json
import os
import sys
import time
from datetime import datetime
from typing import Dict, List, Optional, Tuple

import requests


class GitHubWorkflowCleaner:
    """Manages cleanup of GitHub workflow runs."""

    def __init__(self, token: str, repo: str, dry_run: bool = False):
        """
        Initialize the workflow cleaner.

        Args:
            token: GitHub personal access token
            repo: Repository in format 'owner/repo'
            dry_run: If True, only preview actions without executing
        """
        self.token = token
        self.repo = repo
        self.dry_run = dry_run
        self.base_url = "https://api.github.com"
        self.headers = {
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github.v3+json",
            "User-Agent": "MLSysBook-Workflow-Cleaner"
        }

    def _make_request(self, method: str, url: str, **kwargs) -> Optional[requests.Response]:
        """Make a GitHub API request with error handling."""
        try:
            response = requests.request(method, url, headers=self.headers, **kwargs)
            if response.status_code == 403:
                # Check for rate limiting
                if 'X-RateLimit-Remaining' in response.headers:
                    remaining = int(response.headers['X-RateLimit-Remaining'])
                    if remaining == 0:
                        reset_time = int(response.headers['X-RateLimit-Reset'])
                        wait_time = reset_time - int(time.time()) + 1
                        print(f"⚠️ Rate limit exceeded. Waiting {wait_time} seconds...")
                        time.sleep(wait_time)
                        return self._make_request(method, url, **kwargs)
                print(f"❌ Permission denied. Check your token has 'actions:write' scope.")
                return None
            elif response.status_code == 404:
                print(f"❌ Repository not found: {self.repo}")
                return None
            elif not response.ok:
                print(f"❌ API request failed: {response.status_code} - {response.text}")
                return None
            return response
        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            return None

    def get_workflows(self) -> List[Dict]:
        """Get all workflows in the repository."""
        url = f"{self.base_url}/repos/{self.repo}/actions/workflows"
        response = self._make_request("GET", url)
        if not response:
            return []

        workflows = response.json().get('workflows', [])
        print(f"📋 Found {len(workflows)} workflows")
        return workflows

    def get_workflow_runs(self, workflow_id: str, per_page: int = 100) -> List[Dict]:
        """Get all runs for a specific workflow."""
        all_runs = []
        page = 1

        while True:
            url = f"{self.base_url}/repos/{self.repo}/actions/workflows/{workflow_id}/runs"
            params = {
                "per_page": per_page,
                "page": page
            }

            response = self._make_request("GET", url, params=params)
            if not response:
                break

            data = response.json()
            runs = data.get('workflow_runs', [])

            if not runs:
                break

            all_runs.extend(runs)

            # Check if we have more pages
            if len(runs) < per_page:
                break

            page += 1

        return all_runs

    def delete_workflow_run(self, run_id: str) -> bool:
        """Delete a specific workflow run."""
        if self.dry_run:
            return True

        url = f"{self.base_url}/repos/{self.repo}/actions/runs/{run_id}"
        response = self._make_request("DELETE", url)
        return response is not None and response.status_code == 204

    def clean_workflow_runs(self, keep_count: int = 5, workflow_filter: Optional[str] = None) -> Tuple[int, int]:
        """
        Clean up old workflow runs.

        Args:
            keep_count: Number of recent runs to keep per workflow
            workflow_filter: Optional workflow name to filter (e.g., 'quarto-build.yml')

        Returns:
            Tuple of (total_runs_found, runs_to_delete)
        """
        workflows = self.get_workflows()
        if not workflows:
            return 0, 0

        total_runs = 0
        total_to_delete = 0

        for workflow in workflows:
            workflow_name = workflow['name']
            workflow_path = workflow['path'].split('/')[-1]  # Get filename
            workflow_id = workflow['id']

            # Apply filter if specified
            if workflow_filter and workflow_filter not in [workflow_name, workflow_path]:
                continue

            print(f"\n🔍 Processing workflow: {workflow_name} ({workflow_path})")

            # Get all runs for this workflow
            runs = self.get_workflow_runs(workflow_id)
            total_runs += len(runs)

            if len(runs) <= keep_count:
                print(f"   ✅ Only {len(runs)} runs found, keeping all")
                continue

            # Sort runs by creation date (newest first)
            runs.sort(key=lambda x: x['created_at'], reverse=True)

            # Identify runs to delete (everything after keep_count)
            runs_to_keep = runs[:keep_count]
            runs_to_delete = runs[keep_count:]

            print(f"   📊 Total runs: {len(runs)}")
            print(f"   📌 Keeping: {len(runs_to_keep)} most recent")
            print(f"   🗑️ To delete: {len(runs_to_delete)}")

            if self.dry_run:
                print(f"   🔍 DRY RUN: Would delete {len(runs_to_delete)} runs")
                total_to_delete += len(runs_to_delete)
                continue

            # Delete old runs
            deleted_count = 0
            for run in runs_to_delete:
                run_id = run['id']
                run_number = run['run_number']
                status = run['status']
                conclusion = run['conclusion']
                created_at = run['created_at']

                print(f"   🗑️ Deleting run #{run_number} ({status}/{conclusion}) from {created_at}")

                if self.delete_workflow_run(run_id):
                    deleted_count += 1
                    # Small delay to avoid overwhelming the API
                    time.sleep(0.5)
                else:
                    print(f"   ❌ Failed to delete run #{run_number}")

            total_to_delete += deleted_count
            print(f"   ✅ Successfully deleted {deleted_count}/{len(runs_to_delete)} runs")

        return total_runs, total_to_delete

    def show_workflow_summary(self):
        """Show a summary of all workflows and their run counts."""
        workflows = self.get_workflows()
        if not workflows:
            return

        print(f"\n📊 Workflow Summary for {self.repo}")
        print("=" * 60)

        total_runs = 0
        for workflow in workflows:
            workflow_name = workflow['name']
            workflow_path = workflow['path'].split('/')[-1]
            workflow_id = workflow['id']

            runs = self.get_workflow_runs(workflow_id)
            run_count = len(runs)
            total_runs += run_count

            # Count by status
            statuses = {}
            for run in runs:
                status = f"{run['status']}/{run.get('conclusion', 'N/A')}"
                statuses[status] = statuses.get(status, 0) + 1

            print(f"{workflow_name} ({workflow_path}): {run_count} runs")
            for status, count in sorted(statuses.items()):
                print(f"  - {status}: {count}")

        print(f"\n📈 Total workflow runs across all workflows: {total_runs}")


def get_repo_from_git() -> Optional[str]:
    """Try to determine repository from git remote."""
    try:
        import subprocess
        result = subprocess.run(
            ['git', 'remote', 'get-url', 'origin'],
            capture_output=True,
            text=True,
            check=True
        )
        remote_url = result.stdout.strip()

        # Parse GitHub URL
        if 'github.com' in remote_url:
            if remote_url.startswith('git@github.com:'):
                repo = remote_url.replace('git@github.com:', '').replace('.git', '')
            elif remote_url.startswith('https://github.com/'):
                repo = remote_url.replace('https://github.com/', '').replace('.git', '')
            else:
                return None
            return repo
    except Exception:  # was a bare `except:`; avoid swallowing SystemExit/KeyboardInterrupt
        return None

    return None


def main():
    parser = argparse.ArgumentParser(
        description="Clean up old GitHub workflow runs",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Show summary of all workflow runs
  python cleanup_workflow_runs.py --summary

  # Dry run - see what would be deleted
  python cleanup_workflow_runs.py --dry-run --keep 5

  # Clean up, keeping 10 most recent runs per workflow
  python cleanup_workflow_runs.py --keep 10

  # Clean up specific workflow only
  python cleanup_workflow_runs.py --workflow "quarto-build.yml" --keep 3

Environment Variables:
  GITHUB_TOKEN - GitHub personal access token (alternative to --token)
        """
    )

    parser.add_argument(
        '--token',
        help='GitHub personal access token (or set GITHUB_TOKEN env var)'
    )
    parser.add_argument(
        '--repo',
        help='Repository in format owner/repo (auto-detected from git if not provided)'
    )
    parser.add_argument(
        '--keep',
        type=int,
        default=5,
        help='Number of recent workflow runs to keep per workflow (default: 5)'
    )
    parser.add_argument(
        '--workflow',
        help='Clean specific workflow only (by name or filename)'
    )
    parser.add_argument(
        '--dry-run',
        action='store_true',
        help='Preview what would be deleted without actually deleting'
    )
    parser.add_argument(
        '--summary',
        action='store_true',
        help='Show summary of workflow runs and exit'
    )

    args = parser.parse_args()

    # Get GitHub token
    token = args.token or os.getenv('GITHUB_TOKEN')
    if not token:
        print("❌ GitHub token required. Use --token flag or set GITHUB_TOKEN environment variable")
        print("   Generate token at: https://github.com/settings/tokens")
        print("   Required scopes: actions:write, repo")
        sys.exit(1)

    # Get repository
    repo = args.repo or get_repo_from_git()
    if not repo:
        print("❌ Repository not specified and could not auto-detect from git")
        print("   Use --repo owner/repo or run from a git repository")
        sys.exit(1)

    print(f"🚀 GitHub Workflow Cleanup for {repo}")
    print(f"   Token: {'*' * (len(token) - 4)}{token[-4:]}")
    print(f"   Mode: {'DRY RUN' if args.dry_run else 'LIVE'}")

    # Initialize cleaner
    cleaner = GitHubWorkflowCleaner(token, repo, args.dry_run)

    if args.summary:
        cleaner.show_workflow_summary()
        return

    # Clean workflow runs
    print(f"\n🧹 Starting cleanup (keeping {args.keep} runs per workflow)")
    if args.workflow:
        print(f"   Filtering to workflow: {args.workflow}")

    total_runs, deleted_runs = cleaner.clean_workflow_runs(
        keep_count=args.keep,
        workflow_filter=args.workflow
    )

    print(f"\n📊 Cleanup Summary")
    print("=" * 40)
    print(f"Total workflow runs found: {total_runs}")
    if args.dry_run:
        print(f"Runs that would be deleted: {deleted_runs}")
        print("\n💡 Run without --dry-run to actually delete the runs")
    else:
        print(f"Runs successfully deleted: {deleted_runs}")
        print("✅ Cleanup completed!")


if __name__ == "__main__":
    main()
@@ -1,68 +0,0 @@
import re
import sys
import os


def update_callouts(text):
    callout_with_title_pattern = re.compile(r':::\{.*?title=".*?".*?\}')
    callout_without_title_pattern = re.compile(
        r'(:::\{(?P<class_id>[^\n]+?)\})\n\n(?!<!--)(?P<header>#{1,6} (?P<title>[^\n]+))\n',
        re.MULTILINE
    )

    def replacer(match):
        class_id = match.group('class_id').strip()
        title = match.group('title')
        updated_callout = f":::{{{class_id} title=\"{title}\"}}\n"
        return updated_callout

    # Callouts that already carry a title are left untouched (identity
    # substitution); only title-less callouts followed by a heading are rewritten.
    text_without_titled_callouts = callout_with_title_pattern.sub(lambda m: m.group(0), text)
    updated_text = callout_without_title_pattern.sub(replacer, text_without_titled_callouts)

    return updated_text


def process_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    updated_content = update_callouts(content)

    if content != updated_content:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(updated_content)
        print(f"Updated: {filepath}")
    else:
        print(f"No changes: {filepath}")


def process_directory(directory):
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".qmd"):
                filepath = os.path.join(root, file)
                process_file(filepath)


def print_usage():
    print("Usage:")
    print("  python3 fixtitle.py -d <directory>   # Process all .qmd files in the directory recursively")
    print("  python3 fixtitle.py -f <file>        # Process a single .qmd file")
    sys.exit(1)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print_usage()

    option, path = sys.argv[1], sys.argv[2]

    if option == "-d":
        if not os.path.isdir(path):
            print(f"Error: '{path}' is not a valid directory.")
            sys.exit(1)
        process_directory(path)

    elif option == "-f":
        if not os.path.isfile(path) or not path.endswith(".qmd"):
            print(f"Error: '{path}' is not a valid .qmd file.")
            sys.exit(1)
        process_file(path)

    else:
        print_usage()