OSError: [Errno 30] Cannot create directory '/efs'. Detail: [errno 30] Read-only file system #57

Open
opened 2025-11-02 00:02:42 -05:00 by GiteaMirror · 5 comments

Originally created by @bhavya-giri on GitHub (Oct 26, 2023).

@GokuMohandas can you help me figure this out


@bhavya-giri commented on GitHub (Oct 30, 2023):

<img width="1414" alt="Screenshot 2023-10-30 at 7 58 46 AM" src="https://github.com/GokuMohandas/Made-With-ML/assets/102273412/1669d9be-f8c6-4201-9f7f-0769a0efeeb2">
<img width="1414" alt="Screenshot 2023-10-30 at 7 59 08 AM" src="https://github.com/GokuMohandas/Made-With-ML/assets/102273412/9a57482d-1cb3-4f80-9cd5-204f5f101413">
<img width="1414" alt="Screenshot 2023-10-30 at 7 59 18 AM" src="https://github.com/GokuMohandas/Made-With-ML/assets/102273412/ff9aade6-d574-404f-9178-e3ec8b8a2a36">

@Meryl-Fang commented on GitHub (Nov 12, 2023):

Same error here — have you managed to resolve it?


@taaha commented on GitHub (Nov 12, 2023):

I am having the same issue and no idea why. Basically it is unable to load the function from the madewithml/data module. A hack that worked for me is to create and run the following code cell above the erroneous one:

```python
import re
from typing import Dict, List, Tuple

import numpy as np
import pandas as pd
import ray
from ray.data import Dataset
from sklearn.model_selection import train_test_split
from transformers import BertTokenizer

def stratify_split(
    ds: Dataset,
    stratify: str,
    test_size: float,
    shuffle: bool = True,
    seed: int = 1234,
) -> Tuple[Dataset, Dataset]:
    """Split a dataset into train and test splits with equal
    amounts of data points from each class in the column we
    want to stratify on.

    Args:
        ds (Dataset): Input dataset to split.
        stratify (str): Name of column to split on.
        test_size (float): Proportion of dataset to split for test set.
        shuffle (bool, optional): whether to shuffle the dataset. Defaults to True.
        seed (int, optional): seed for shuffling. Defaults to 1234.

    Returns:
        Tuple[Dataset, Dataset]: the stratified train and test datasets.
    """

    def _add_split(df: pd.DataFrame) -> pd.DataFrame:  # pragma: no cover, used in parent function
        """Naively split a dataframe into train and test splits.
        Add a column specifying whether it's the train or test split."""
        train, test = train_test_split(df, test_size=test_size, shuffle=shuffle, random_state=seed)
        train["_split"] = "train"
        test["_split"] = "test"
        return pd.concat([train, test])

    def _filter_split(df: pd.DataFrame, split: str) -> pd.DataFrame:  # pragma: no cover, used in parent function
        """Filter by data points that match the split column's value
        and return the dataframe with the _split column dropped."""
        return df[df["_split"] == split].drop("_split", axis=1)

    # Train, test split with stratify
    grouped = ds.groupby(stratify).map_groups(_add_split, batch_format="pandas")  # group by each unique value in the column we want to stratify on
    train_ds = grouped.map_batches(_filter_split, fn_kwargs={"split": "train"}, batch_format="pandas")  # combine
    test_ds = grouped.map_batches(_filter_split, fn_kwargs={"split": "test"}, batch_format="pandas")  # combine

    # Shuffle each split (required)
    train_ds = train_ds.random_shuffle(seed=seed)
    test_ds = test_ds.random_shuffle(seed=seed)

    return train_ds, test_ds
```

Basically, instead of importing the function (which fails, no idea why), we define and use it directly in the notebook.
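For anyone who wants to check the splitting logic without a running Ray cluster, here is a pandas-only sketch of the same idea (everything here — `stratify_split_pandas`, the toy `tag` column — is illustrative and not from the repo): split each class group separately with `train_test_split`, then concatenate, so both splits preserve the class proportions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def stratify_split_pandas(df, stratify, test_size, seed=1234):
    """Per-class train/test split: split each class group separately,
    then concatenate, so both splits keep the class proportions."""
    train_parts, test_parts = [], []
    for _, group in df.groupby(stratify):
        train, test = train_test_split(group, test_size=test_size, random_state=seed)
        train_parts.append(train)
        test_parts.append(test)
    return pd.concat(train_parts), pd.concat(test_parts)

# Toy data: two classes of 10 samples each.
df = pd.DataFrame({"text": [f"sample {i}" for i in range(20)],
                   "tag": ["a"] * 10 + ["b"] * 10})
train_df, test_df = stratify_split_pandas(df, stratify="tag", test_size=0.2)
# Each class contributes 8 rows to train and 2 to test.
print(train_df["tag"].value_counts().to_dict())
```

The Ray version in the cell above does the same thing at scale, with `map_groups` playing the role of the per-group loop.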


@bhavya-giri commented on GitHub (Nov 12, 2023):

But the same error would come in training, check this repo https://github.com/GokuMohandas/mlops-course


@gOsuzu commented on GitHub (Dec 3, 2023):

As the error message indicates, this error is caused by permissions on the /efs folder you are creating.
I assume you are using your own local machine. I edited as below, and it worked in my local environment, macOS 14.1.2 with Python 3.10.11. The path will differ depending on where your directory is located. I hope this helps.

  1. config.py
    Change line 13:
    EFS_DIR = Path(f"/Users/<your_user_name>/efs/shared_storage/madewithml/{os.environ.get('GITHUB_USERNAME', '')}")

  2. madewithml.ipynb
    Change the code in the Setup section:
    EFS_DIR = f"/Users/<your_user_name>/efs/shared_storage/madewithml/{os.environ['GITHUB_USERNAME']}"
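A more portable variant of this fix (my own sketch, not from the course repo) is to anchor `EFS_DIR` under the user's home directory with `pathlib.Path.home()`, so the path works on any OS without hard-coding `/Users/<your_user_name>`:

```python
import os
from pathlib import Path

# Hypothetical portable alternative: build EFS_DIR under the home directory
# instead of the root-level /efs, which is read-only on local machines.
EFS_DIR = Path.home() / "efs" / "shared_storage" / "madewithml" / os.environ.get("GITHUB_USERNAME", "")
EFS_DIR.mkdir(parents=True, exist_ok=True)  # succeeds without root privileges
print(EFS_DIR)
```

The root filesystem is read-only from an unprivileged process on most setups, which is exactly why the original `Path("/efs/...")` raises `OSError: [Errno 30]`; anything under `Path.home()` is writable by the current user.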

Reference: github-starred/Made-With-ML#57