We've created step-by-step guides to help you discover data and publish your own with MDF.
To use MDF, you'll need a Globus account. MDF uses Globus to transfer data from your device to anyone who wants to access it. Creating an account is free and easy.
Once you have a Globus account, join the group below to publish data!
Collect data, preferably in openly accessible formats.
MDF supports collecting data from a local computer, Globus endpoints, Google Drive, and HTTPS-accessible locations (e.g., GitHub or a group web server).
Read about What Makes a Good Dataset. We provide a checklist of what you need, with an explanation of why each item matters.
You can publish through our website or use MDF Forge to publish through our SDK.
Note: If your data is structured and ready to be loaded into a DataFrame with Python, check out Foundry-ML datasets on MDF. These are structured ML-ready datasets that can be accessed programmatically.
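Before you publish, it can help to check which of the supported source types listed above each of your data locations falls under. The snippet below is a minimal sketch using only the Python standard library, not part of MDF itself. The function name `classify_source`, the labels, the `globus://` scheme check, and the `drive.google.com` host check are all illustrative assumptions about how your locations are written.

```python
from urllib.parse import urlparse

# Hypothetical labels for the four source types MDF lists as supported.
LOCAL, GLOBUS, GOOGLE_DRIVE, HTTPS = "local", "globus", "google_drive", "https"


def classify_source(location: str) -> str:
    """Guess which supported source type a data location belongs to.

    Assumptions (not MDF rules): Globus locations are written as
    globus://<endpoint>/<path>, and Google Drive links are HTTPS URLs
    hosted on drive.google.com.
    """
    parsed = urlparse(location)
    scheme = parsed.scheme.lower()
    host = (parsed.hostname or "").lower()

    if scheme == "globus":
        return GLOBUS
    if scheme == "https":
        # Google Drive links are also HTTPS, so check the host first.
        return GOOGLE_DRIVE if host == "drive.google.com" else HTTPS
    # No scheme, file://, or a single letter (a Windows drive such as C:)
    # is treated as a path on the local computer.
    if scheme in ("", "file") or len(scheme) == 1:
        return LOCAL
    raise ValueError(f"Unsupported data location: {location!r}")


if __name__ == "__main__":
    for loc in ["/home/me/results", "https://github.com/org/repo"]:
        print(loc, "->", classify_source(loc))
```

Anything the sketch rejects, such as plain `http://` or `ftp://` links, would need to be moved to one of the supported locations first.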
Browse published datasets through a web interface. You can search, discover, and download data using only our website.
Explore datasets through a Python interface, either with MDF Forge or Foundry-ML.
Foundry-ML datasets have a required structure that allows them to load directly into a DataFrame and be immediately ready to use in a Python environment. To use Foundry-ML, follow the steps for loading data from any of our example notebooks, documentation, or a specific dataset's page.
MDF Forge lets you load datasets programmatically, but because these datasets have no required structure, they may need some extra preparation before you can use them in a Python environment. To use MDF Forge, check out the documentation or the GitHub repo.
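To give a sense of what that extra work can look like, here is a minimal sketch that flattens nested search records into rows and columns. The sample records are made up for illustration, and the helpers `flatten`, `to_table`, and `fetch_records` are hypothetical names rather than part of MDF Forge. `fetch_records` assumes the `mdf_forge` package is installed and that you can complete a Globus login; check the MDF Forge documentation for the current API before relying on it.

```python
def flatten(record, parent_key="", sep="."):
    """Turn a nested dict into a flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def to_table(records):
    """Return (columns, rows); fields missing from a record become None."""
    flat_rows = [flatten(r) for r in records]
    columns = sorted({key for row in flat_rows for key in row})
    rows = [[row.get(col) for col in columns] for row in flat_rows]
    return columns, rows


def fetch_records(element, limit=10):
    """Assumed usage of MDF Forge; needs mdf_forge and a Globus login."""
    from mdf_forge import Forge  # third-party, imported only when called

    mdf = Forge()
    mdf.match_elements([element])
    return mdf.search(limit=limit)


# Made-up records standing in for search results (illustrative only).
sample_records = [
    {"mdf": {"source_name": "example_dataset"},
     "material": {"composition": "Al2O3", "elements": ["Al", "O"]}},
    {"mdf": {"source_name": "example_dataset"},
     "material": {"composition": "Cu"}},
]

columns, rows = to_table(sample_records)
```

If pandas is installed, `pandas.DataFrame(rows, columns=columns)` turns the result into a DataFrame. Foundry-ML datasets skip this step because their structure is fixed in advance.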