Primal and dual AC/DC/SOC-OPF solutions for power grids ranging from small test cases to systems with more than 10,000 buses.
# PGLearn Datasets
This HuggingFace organization hosts the PGLearn datasets, described in https://arxiv.org/abs/2505.22825.
| Collection | Link | # feasible samples | # total samples |
|---|---|---|---|
| Small (#buses ≤ 1000) | https://huggingface.co/collections/PGLearn/pglearn-small | ~5.749M | 6M |
| Medium (#buses ≤ 5000) | https://huggingface.co/collections/PGLearn/pglearn-medium | ~1.573M | 1.75M |
| Large (#buses ≤ 10000) | https://huggingface.co/collections/PGLearn/pglearn-large | ~253.6K | 300K |
| Extra-Large (#buses > 10000) | https://huggingface.co/collections/PGLearn/pglearn-extralarge | ~69.9K | 75K |
| N-1 contingency cases | https://huggingface.co/collections/PGLearn/pglearn-n-1 | ~3.575M | 4.2M |
## Citation
```bibtex
@article{klamkin2025pglearn,
  title={PGLearn--An Open-Source Learning Toolkit for Optimal Power Flow},
  author={Klamkin, Michael and Tanneau, Mathieu and Van Hentenryck, Pascal},
  journal={arXiv preprint arXiv:2505.22825},
  year={2025}
}
```
## Instructions
The datasets are available in two formats: Parquet and HDF5.
- Parquet: Use the HuggingFace `datasets` package as usual; see their documentation for further instructions, and the sketch directly below.
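  A minimal sketch of the Parquet route (configuration and split names depend on the repository; check the dataset card for what is available):

```python
from datasets import load_dataset

# Minimal sketch: load the Parquet version of one PGLearn dataset.
# Some repositories define multiple configurations; if so, pass the
# configuration name as the second argument to load_dataset.
ds = load_dataset("PGLearn/PGLearn-Small-14_ieee")
print(ds)
```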
- HDF5: Use the `snapshot_download` function from `huggingface_hub`:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    "PGLearn/PGLearn-Small-14_ieee",
    repo_type="dataset",
    local_dir="./14_ieee",  # where to put it
    revision="script",  # IMPORTANT: grab the HDF5 files, not the parquet files
    # you can set filters, e.g. if we only want the DC samples:
    allow_patterns=[
        "*/DCOPF/*", "*input*", "case.json.gz", "config.toml",
    ],
    ignore_patterns=[
        "infeasible/*",
    ],
)
```
Then, you can load the files with h5py. Note that for some large cases, the dual solution data had to be split into multiple files; the helper below can decompress and reconstruct them:
```python
from pathlib import Path
import gzip, shutil

import h5py

def open_maybe_gzip_cat(path: str | list):
    """Open an HDF5 file that may be gzipped and/or split into pieces."""
    if isinstance(path, list):
        # Concatenate the split pieces (e.g. dual/xaa, dual/xab, ...) into a
        # single .h5 file placed next to the directory holding the pieces.
        dest = Path(path[0]).parent.with_suffix(".h5")
        if not dest.exists():
            with open(dest, "wb") as dest_f:
                for piece in path:
                    with open(piece, "rb") as piece_f:
                        shutil.copyfileobj(piece_f, dest_f)
            shutil.rmtree(Path(path[0]).parent)  # remove the now-redundant pieces
        path = dest.as_posix()
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")

primal = h5py.File(open_maybe_gzip_cat("data/SOCOPF/primal.h5.gz"), "r")
dual = h5py.File(open_maybe_gzip_cat(
    ["data/SOCOPF/dual/xaa", "data/SOCOPF/dual/xab", "data/SOCOPF/dual/xac"]
), "r")
```
If you plan to use the HDF5 dataset more than once, it is recommended to decompress the files ahead of time. The following snippet does this:
```python
from pathlib import Path
import gzip, shutil

# Decompress every .h5.gz file under the download directory in place.
for src in Path("./14_ieee").rglob("*.h5.gz"):
    dest = src.with_suffix("")  # "primal.h5.gz" -> "primal.h5"
    with gzip.open(src, "rb") as fsrc, open(dest, "wb") as fdest:
        shutil.copyfileobj(fsrc, fdest)
    src.unlink()  # optional: delete the compressed files
```
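After decompressing, the files can be opened directly, without the helper above. A sketch, assuming the DCOPF filters from the earlier `snapshot_download` call (the exact path depends on what you downloaded):

```python
import h5py

# Path is illustrative; it depends on the filters used during download.
with h5py.File("./14_ieee/data/DCOPF/primal.h5", "r") as f:
    print(list(f.keys()))
```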