Configs
Config-Driven Experimentation
Metaflow's config-driven experimentation lets you separate experiment configuration (parameters, hyperparameters, environment variables, etc.) from flow code. This makes it easier to:
- Run many experiments with different configs
- Share, version, and reproduce experiments
- Keep flow code clean and DRY
- Support team-wide configuration standards
Key Concepts
1. Config Files
You define configs in external files (usually .yaml or .json):
# example_config.yaml
lr: 0.01
dropout: 0.3
epochs: 5
2. Loading Configs
Use the from_config() helper to load these values into a flow:
from metaflow import FlowSpec, step, from_config

config = from_config("example_config.yaml")

class MyFlow(FlowSpec):
    @step
    def start(self):
        self.lr = config["lr"]
        ...
3. Combining with Parameters
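You can still define a Parameter on the flow and override its value via the CLI:
python my_flow.py run --lr 0.05
To give CLI overrides precedence, check the parameter first and fall back to the config value only when the parameter was not supplied, e.g. self.lr if self.lr is not None else config["lr"] (this assumes the parameter is declared with default=None). Note that config.get("lr", self.lr) does the opposite: it returns the config value whenever the key exists and uses self.lr only as a fallback.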
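The precedence rule can be made explicit with a small helper. The sketch below is plain Python and does not depend on Metaflow; `resolve` is a hypothetical name, and it assumes that a parameter not passed on the command line arrives as None.

```python
def resolve(cli_value, config, key, default=None):
    """Return the CLI value if one was given, else the config entry, else default."""
    if cli_value is not None:
        return cli_value
    return config.get(key, default)


config = {"lr": 0.01, "dropout": 0.3}

from_cli = resolve(0.05, config, "lr")   # a CLI override takes precedence
from_file = resolve(None, config, "lr")  # no override: use the config value
print(from_cli, from_file)
```

Inside a step, the first argument would be the parameter attribute (for example self.lr) and the second the loaded config dict.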
How It Works Under the Hood
- Configs are loaded at runtime using from_config()
- Configs can be Python modules, YAML, or JSON
- Values are treated as regular Python dict entries
- You can load multiple config files and merge them
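Because loaded values are ordinary dicts, merging several config sources can be done in plain Python. The following is a minimal sketch, not a Metaflow API: `merge_configs` is a hypothetical helper that assumes later sources should override earlier ones and that nested dicts should be merged key by key.

```python
def merge_configs(*configs):
    """Merge dicts left to right; later values override earlier ones.

    Nested dicts are merged recursively instead of being replaced wholesale,
    and the inputs are left unmodified.
    """
    merged = {}
    for cfg in configs:
        for key, value in cfg.items():
            if isinstance(value, dict):
                existing = merged.get(key)
                start = existing if isinstance(existing, dict) else {}
                merged[key] = merge_configs(start, value)
            else:
                merged[key] = value
    return merged


base = {"lr": 0.01, "model": {"dropout": 0.3, "layers": 2}}
override = {"lr": 0.05, "model": {"dropout": 0.5}}
merged = merge_configs(base, override)
print(merged)
```

Here the override changes lr and model.dropout while model.layers is carried over from the base config.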
Advanced Features
| Feature | Description |
|---|---|
| Multiple config files | Combine and override values across sources |
| Dynamic configs | Load Python-based config logic (e.g., environment-aware) |
| Experiment tracking | Combine configs with Metaflow tags/metadata/cards |
| YAML/JSON support | Native parsing of structured config files |
Basic configs
Metaflow provides a configuration system to control behavior globally or per environment, enabling you to customize how flows run, store data, use compute resources, and more, all without modifying your code.
Where Configurations Come From
Configurations can be set in four layers (in priority order):
| Layer | Example | Priority |
|---|---|---|
| 1. Runtime overrides | CLI: METAFLOW_DATASTORE=s3 python flow.py run | Highest |
| 2. Environment files | .metaflowconfig/config.json | |
| 3. Python code | from metaflow import config | |
| 4. Built-in defaults | Metaflow internal settings | Lowest |
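The priority order in the table can be pictured as a chain of lookups in which a higher layer shadows the ones below it. The sketch below illustrates that idea with the standard library's ChainMap; it is not Metaflow's internal implementation, and `layered_settings` and the sample values are hypothetical.

```python
from collections import ChainMap


def layered_settings(environ, file_settings, code_settings, defaults):
    """Build a lookup in which earlier layers shadow later ones."""
    # Only METAFLOW_* environment variables are treated as overrides.
    env_layer = {k: v for k, v in environ.items() if k.startswith("METAFLOW_")}
    return ChainMap(env_layer, file_settings, code_settings, defaults)


settings = layered_settings(
    environ={"METAFLOW_DATASTORE": "s3", "HOME": "/home/user"},
    file_settings={"METAFLOW_DATASTORE": "local", "METAFLOW_PROFILE": "default"},
    code_settings={},
    defaults={"METAFLOW_DEFAULT_METADATA": "local"},
)

print(settings["METAFLOW_DATASTORE"])         # taken from the environment layer
print(settings["METAFLOW_PROFILE"])           # taken from the config file layer
print(settings["METAFLOW_DEFAULT_METADATA"])  # taken from the defaults layer
```

In real use the first argument would be os.environ rather than a literal dict.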
How to Set Configs
Option 1: Runtime environment variables (most common)
export METAFLOW_PROFILE=my-aws-profile
export METAFLOW_DATASTORE=s3
Option 2: Config file (.metaflowconfig/config.json)
{
"METAFLOW_PROFILE": "default",
"METAFLOW_DATASTORE": "s3",
"METAFLOW_S3ROOT": "s3://my-bucket/metaflow"
}
Option 3: In Python code (not recommended for shared/team flows)
from metaflow import config
print(config.METAFLOW_PROFILE)
Summary Benefits
- Configure flows without hardcoding
- Make your code environment-agnostic
- Simplify running flows under different profiles (e.g., dev vs. prod)
- Work seamlessly with cloud deployments
.metaflowconfig/config.json Template
You can place this file in your project root or in your home directory as ~/.metaflowconfig/config.json.
Local & AWS Profile Example
{
"default": {
"METAFLOW_PROFILE": "local",
"METAFLOW_DATASTORE": "local",
"METAFLOW_DEFAULT_METADATA": "local",
"METAFLOW_CARD_DIR": "./_cards"
},
"aws": {
"METAFLOW_PROFILE": "aws",
"METAFLOW_DATASTORE": "s3",
"METAFLOW_S3ROOT": "s3://your-bucket/metaflow",
"METAFLOW_DEFAULT_METADATA": "service",
"METAFLOW_SERVICE_URL": "https://your-metaflow-metadata-service.com",
"METAFLOW_CARD_DIR": "./_cards",
"METAFLOW_CARD_VIEWER": "https://your-metaflow-ui/cards"
}
}
This allows you to switch between profiles using:
export METAFLOW_PROFILE=default # for local
export METAFLOW_PROFILE=aws # for cloud
Test it!
Try running:
export METAFLOW_PROFILE=aws
python my_flow.py run
Then switch back:
export METAFLOW_PROFILE=default
python my_flow.py run
Each profile will:
- Use the correct artifact storage (local folder vs. S3)
- Store metadata locally or in a shared service
- Send cards to the appropriate viewer
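The profile switch described above amounts to picking one block of a nested JSON document by name. The sketch below shows that selection in plain Python under the assumption that the file has the two-level layout of the template; `select_profile` is a hypothetical helper, not a Metaflow function, and the JSON is inlined to keep the example self-contained.

```python
import json

PROFILES_JSON = """
{
  "default": {"METAFLOW_DATASTORE": "local", "METAFLOW_DEFAULT_METADATA": "local"},
  "aws": {"METAFLOW_DATASTORE": "s3", "METAFLOW_DEFAULT_METADATA": "service"}
}
"""


def select_profile(profiles, environ):
    """Return the settings block named by METAFLOW_PROFILE (or 'default')."""
    name = environ.get("METAFLOW_PROFILE", "default")
    try:
        return profiles[name]
    except KeyError:
        raise KeyError(
            f"unknown profile {name!r}; available: {sorted(profiles)}"
        ) from None


profiles = json.loads(PROFILES_JSON)
active = select_profile(profiles, {"METAFLOW_PROFILE": "aws"})
fallback = select_profile(profiles, {})
print(active["METAFLOW_DATASTORE"], fallback["METAFLOW_DATASTORE"])
```

Failing loudly on an unknown profile name helps catch typos such as METAFLOW_PROFILE=asw before a run starts with the wrong datastore.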
Useful Configuration Keys
| Key | Description |
|---|---|
| METAFLOW_PROFILE | Profile name (default, aws, etc.) |
| METAFLOW_DATASTORE | local or s3 |
| METAFLOW_S3ROOT | S3 root path for all artifacts |
| METAFLOW_DEFAULT_METADATA | local or service |
| METAFLOW_SERVICE_URL | URL to your metadata service |
| METAFLOW_CARD_DIR | Directory where local cards are stored |
| METAFLOW_CARD_VIEWER | Optional external URL for UI card viewing |
Pro Tips
- Keep .metaflowconfig/config.json in source control (but never hardcode credentials)
- Store secrets like AWS credentials in ~/.aws/credentials or IAM roles
- You can add more profiles like staging, dev, prod, etc.
- Use metaflow metadata get and status to confirm the correct setup
Parsing Configs in Metaflow
Core Tool: from_config()
Metaflow provides a utility:
from metaflow import from_config
This function loads a config file into a Python dictionary. The file can be:
- .yaml or .yml
- .json
- .py (Python config file)
Example Usage
YAML file: config.yaml
learning_rate: 0.01
dropout: 0.3
Flow code:
from metaflow import FlowSpec, step, from_config

config = from_config("config.yaml")

class MyFlow(FlowSpec):
    @step
    def start(self):
        self.lr = config["learning_rate"]
        print("LR:", self.lr)
        self.next(self.end)

    @step
    def end(self):
        pass
Supported Formats & Behavior
| Format | Notes |
|---|---|
| YAML | Parsed with PyYAML (must be installed) |
| JSON | Parsed with Python json module |
| Python | Must define a CONFIG dict |

.py file example:
# config.py
CONFIG = {
"learning_rate": 0.1,
"epochs": 5
}
Then use:
config = from_config("config.py")
Safety Notes
- Python config files are executed, so use with caution (don't load untrusted files).
- If config is missing or malformed, from_config() raises an error.
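The format table and safety notes can be summarized as a loader that dispatches on the file extension and fails loudly on bad input. The sketch below is a standalone illustration of that behavior, not the source of from_config(); `load_config` is a hypothetical name, and the YAML branch assumes PyYAML is installed.

```python
import json
import runpy
import tempfile
from pathlib import Path


def load_config(path):
    """Load a config file into a dict, dispatching on the file extension."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"config file not found: {path}")
    suffix = path.suffix.lower()
    if suffix == ".json":
        data = json.loads(path.read_text())
    elif suffix in (".yaml", ".yml"):
        import yaml  # PyYAML (third party); imported only when needed
        data = yaml.safe_load(path.read_text())
    elif suffix == ".py":
        # This executes the file, so only load Python configs you trust.
        data = runpy.run_path(str(path)).get("CONFIG")
    else:
        raise ValueError(f"unsupported config format: {suffix!r}")
    if not isinstance(data, dict):
        raise ValueError(f"{path} did not produce a dict")
    return data


# Demo with temporary files so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "config.json"
    json_path.write_text('{"learning_rate": 0.01}')
    py_path = Path(tmp) / "config.py"
    py_path.write_text('CONFIG = {"learning_rate": 0.1, "epochs": 5}\n')
    json_config = load_config(json_path)
    py_config = load_config(py_path)

print(json_config, py_config)
```

A Python config file that does not define CONFIG produces None here and is rejected by the final type check, which mirrors the "must define a CONFIG dict" rule in the table.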
Custom Parsers in Metaflow
Custom parsers allow you to extend Metaflow's configuration system by defining your own rules for reading configuration files. This is useful when:
- Your configuration format is non-standard or specialized.
- You need custom preprocessing before the configuration values are used.
- You want to integrate with legacy systems or non-YAML/JSON formats.
Core Concepts
- Parser Interface: Metaflow expects parsers to adhere to a standard interface. Custom parsers are classes that implement methods for reading and parsing configuration files.
- Registration: Your custom parser must be registered with Metaflow so that from_config() can detect and use it based on the file extension (or other heuristics).
- Extensibility: By writing your own parser, you can:
  - Handle new file extensions.
  - Preprocess configuration data (e.g., environment variable substitution, validation).
  - Merge multiple config files or sources in a custom manner.
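As an illustration of the preprocessing mentioned above, environment variable substitution can be applied to a parsed config before it is used. This is a minimal sketch using the standard library's string.Template; `substitute_env` and the TEAM variable are hypothetical and not part of Metaflow.

```python
from string import Template


def substitute_env(config, environ):
    """Return a copy of config with $VAR and ${VAR} expanded in string values."""
    result = {}
    for key, value in config.items():
        if isinstance(value, dict):
            result[key] = substitute_env(value, environ)
        elif isinstance(value, str):
            # substitute() raises KeyError when a referenced variable is undefined.
            result[key] = Template(value).substitute(environ)
        else:
            result[key] = value
    return result


raw = {"bucket": "s3://${TEAM}-data/metaflow", "epochs": 5}
expanded = substitute_env(raw, {"TEAM": "ml"})
print(expanded)
```

Using substitute() rather than safe_substitute() is a deliberate choice in this sketch: a missing variable stops the load instead of leaving a literal ${TEAM} in a path.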
Implementation Overview
- Define Your Parser: Create a class that typically inherits from a base parser provided by Metaflow (or implements the necessary interface). Implement at least a parse() method that accepts a file path and returns a Python dictionary.
- Register the Parser: Add your parser to Metaflow's parser registry. This is often done by appending your parser (or its file extension mapping) to an internal list. Metaflow then knows to use your custom parser when encountering a file with the associated extension.
- Use with from_config(): Once registered, you can load your configuration file using from_config(), and Metaflow will automatically invoke your custom parser if the file type matches.
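The define/register/use pattern described above can be sketched without any Metaflow dependency. Everything in the block below is hypothetical (the PARSERS registry, BaseParser, KeyValueParser, parser_for, and the "key = value" file format); it shows the general shape of an extension-keyed parser registry rather than Metaflow's actual parser API.

```python
from pathlib import Path

PARSERS = {}  # maps a file extension such as ".myext" to a parser instance


class BaseParser:
    """Minimal parser interface: subclasses set extensions and implement parse()."""

    extensions = ()

    def parse(self, path):
        raise NotImplementedError

    @classmethod
    def register(cls):
        parser = cls()
        for ext in cls.extensions:
            PARSERS[ext] = parser


class KeyValueParser(BaseParser):
    """Parses lines of the form 'key = value', skipping blanks and # comments."""

    extensions = (".myext",)

    def parse(self, path):
        return self.parse_text(Path(path).read_text())

    def parse_text(self, text):
        config = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected 'key = value', got: {line!r}")
            config[key.strip()] = value.strip()
        return config


def parser_for(path):
    """Look up the registered parser for a path's extension."""
    suffix = Path(path).suffix
    try:
        return PARSERS[suffix]
    except KeyError:
        raise ValueError(f"no parser registered for {suffix!r}") from None


KeyValueParser.register()
parsed = parser_for("config.myext").parse_text("# demo\nsome_parameter = 42\n")
print(parsed)
```

Values come back as strings in this sketch; any type conversion or validation would be an additional preprocessing step in the parser.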
Example Flow (Conceptual)
from metaflow import from_config, FlowSpec, step
from my_custom_parser import MyCustomParser  # Your custom parser class

# Ensure your parser is registered with Metaflow
# (This registration might be handled in your parser module)
MyCustomParser.register()

# Load configuration using your custom parser
config = from_config("config.myext")  # 'myext' is the custom file extension

class MyFlow(FlowSpec):
    @step
    def start(self):
        self.param = config["some_parameter"]
        print("Loaded parameter:", self.param)
        self.next(self.end)

    @step
    def end(self):
        print("Flow complete.")

Note: The above is a conceptual example. The actual implementation may require following Metaflow's custom parser API closely.
Benefits
- Flexibility: Integrate any configuration format you need.
- Customization: Preprocess and validate configurations exactly as required.
- Seamless Integration: Once set up, your custom parser works transparently with Metaflow's from_config() utility.