Installation
Requirements
CausalFM Toolkit requires:
Python 3.9 or higher
PyTorch 2.0 or higher
CUDA-capable GPU (optional, but recommended for training)
Python Dependencies
The main dependencies include:
torch>=2.0.0- Deep learning frameworknumpy>=1.24.0- Numerical computingpandas>=2.0.0- Data manipulationnetworkx>=3.0- Graph operations for DAG constructionscipy>=1.10.0- Scientific computingscikit-learn>=1.3.0- Machine learning utilitiestqdm- Progress barstensorboard- Training visualization
Install from Source
Step 1: Clone the Repository
git clone https://github.com/yccm/CausalFM-toolkit.git
cd CausalFM-toolkit
Step 2: Create Environment
Using Conda (Recommended):
conda create -n causalfm python=3.10
conda activate causalfm
Using venv:
python -m venv causalfm_env
source causalfm_env/bin/activate # On Windows: causalfm_env\Scripts\activate
Step 3: Install Dependencies
pip install -r requirements.txt
GPU Support
For GPU acceleration, ensure you have:
CUDA Toolkit installed (version 11.8 or higher recommended)
cuDNN library
PyTorch with CUDA support
To install PyTorch with CUDA 11.8:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
For other CUDA versions, visit: https://pytorch.org/get-started/locally/
Verify Installation
Run the following to verify your installation:
import causalfm
import torch
print(f"CausalFM version: {causalfm.__version__}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
# Test basic import
from causalfm.data import StandardCATEGenerator
from causalfm.models import StandardCATEModel
from causalfm.training import StandardCATETrainer
from causalfm.evaluation import compute_pehe
print("✅ All imports successful!")
Test Data Generation
Quick test to ensure data generation works:
from causalfm.data import StandardCATEGenerator
import numpy as np
# Generate a small test dataset
generator = StandardCATEGenerator(num_samples=100, num_features=5, seed=42)
df = generator.generate()
print(f"Generated dataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print("✅ Data generation working!")
Troubleshooting
Import Errors
If you encounter ModuleNotFoundError for tabpfn:
The src/tabpfn directory is part of this repository and should be automatically
accessible. Ensure you’re running Python from the repository root directory.
CUDA Out of Memory
If you encounter CUDA out of memory errors during training:
Reduce batch size in
TrainingConfigUse
num_workers=0to disable multiprocessingConsider using CPU for small experiments
config = TrainingConfig(
batch_size=8, # Reduce from default 16
num_workers=0,
device='cpu' # Or 'cuda:0' for GPU
)
Multiprocessing Issues (macOS/Windows)
If you see RuntimeError about multiprocessing, wrap your training code in:
if __name__ == '__main__':
# Your training code here
trainer.train()
Or set num_workers=0 in TrainingConfig.
Development Installation
For development and contributing:
git clone https://github.com/yccm/CausalFM-toolkit.git
cd CausalFM-toolkit
pip install -e . # Editable install
pip install -r requirements-dev.txt # Development dependencies
This allows you to modify the source code and see changes immediately.
Next Steps ———-After installation, check out the Quick Start guide to learn how to use CausalFM Toolkit.