Publishing and Discovering Guides

We've created step-by-step guides to help you discover data and publish your own with MDF.

Publish Data

Sign up and Join

To publish data on MDF, you'll need a Globus account. We use Globus to transfer data from your device to anyone who wants to access it. Creating an account is free and easy, and once you have one you can publish data on MDF.

Collect

Collect data, preferably in openly accessible formats.

MDF supports data collection from local computers, Globus endpoints, Google Drive, and HTTPS-accessible locations (e.g., GitHub or a group web server).

Read about What Makes a Good Dataset. We provide a checklist of what you need, with an explanation of why each item matters.

Publish

You can publish through our website or use MDF Forge to publish through our SDK.

Note: If your data is structured and ready to be loaded into a DataFrame with Python, check out Foundry-ML datasets on MDF. These are structured ML-ready datasets that can be accessed programmatically.

Quickstart Videos

  • Publish via Local
  • Publish via Globus
  • Publish via Google Drive
  • Easier Form Completion

Publish via Globus

  • No additional permissions are required to share data from Globus endpoints.
  • Paths containing `~` are invalid; use full paths.
  • Copy the link to your data with the `Get Link` button in the Globus web app.
  • Add this link as a data location.

Publish via Google Drive

  • Share the data path (read access) with materialsdatafacility@gmail.com.
  • Add google:///my_path as a data location (note the triple `/`).
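The two path rules above (no `~` in paths, and three slashes after `google:`) are easy to get wrong, so it can help to check a data location before submitting it. The following is a minimal sketch only: `check_data_location` is a hypothetical helper written for illustration, not part of MDF or its SDK, and it assumes the `~` rule can be applied to any location string.

```python
def check_data_location(location: str) -> list[str]:
    """Return a list of problems found in a data location (empty if none).

    Hypothetical helper for illustration; not part of any MDF tool.
    """
    problems = []
    # Paths containing `~` are invalid; full paths must be used.
    if "~" in location:
        problems.append("contains '~'; use the full path instead")
    # Google Drive locations take the form google:///my_path (three slashes).
    if location.startswith("google:") and not location.startswith("google:///"):
        problems.append("Google Drive locations need three slashes: google:///my_path")
    return problems


if __name__ == "__main__":
    for loc in ["google:///my_path", "google://my_path", "~/data/run1"]:
        print(loc, "->", check_data_location(loc) or "ok")
```

A location that passes these checks may still be rejected for other reasons, such as missing read access on the Google Drive path.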

Discover Data

Web Interface

Browse published datasets through a web user interface. You can search, discover, and download data just using our website.

Accessing data is as easy as:
  1. Click the "Get Data" button on the page of the dataset you're interested in. This takes you to the data on Globus, which we use to transfer data. You don't need a Globus account to access and download data!
  2. Select the checkbox next to the file you want to download. You can download only one file at a time, and folders cannot be downloaded.
  3. Click the "Download" button.
Ready to get going with our web interface?
Browse Published Data

Programmatic Interface

Explore datasets through a Python interface with either MDF Forge or Foundry-ML.

What's the difference?

Foundry-ML datasets have a required structure that allows them to load directly into a DataFrame and be immediately ready to use in a Python environment. To use Foundry-ML, follow the steps for loading data from any of our example notebooks, documentation, or a specific dataset's page.

MDF Forge lets you load datasets programmatically, but those datasets have no required structure, so they are not ready to use in a Python environment right away. Preparing them for analysis may take a little more work on your end. To use MDF Forge, check out the documentation or the GitHub Repo.
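For a dataset with no required structure, the extra work often begins with finding out which files it contains. The following is a minimal sketch that assumes the dataset has already been downloaded to a local folder. `inventory_files` is a hypothetical helper using only the Python standard library; it is not part of MDF Forge or Foundry-ML.

```python
from collections import defaultdict
from pathlib import Path


def inventory_files(dataset_dir: str) -> dict[str, list[str]]:
    """Group the files under dataset_dir by lowercase file extension.

    Hypothetical helper for illustration; not part of any MDF tool.
    """
    groups = defaultdict(list)
    root = Path(dataset_dir)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Files without an extension are grouped under the empty string.
            relative = path.relative_to(root).as_posix()
            groups[path.suffix.lower()].append(relative)
    return dict(groups)
```

Calling `inventory_files("path/to/downloaded_dataset")` returns a dictionary such as `{".csv": [...], ".json": [...]}`, which shows which parsers you will need before the data can be loaded into a DataFrame.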