Official PyTorch Reading

This page is the research spine behind the docs set. The PyTorch tables below link to official PyTorch documentation; a final table collects external, domain-specific references.

| Topic | Why it matters |
| --- | --- |
| DistributedDataParallel | Current DDP behavior, including the fact that DDP does not shard inputs for you. |
| Distributed Data Parallel Notes | Deeper background on how DDP works. |
| FullyShardedDataParallel | Current FSDP surface and sharding model. |
| Distributed Overview | Reference map for the larger distributed stack. |
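The DDP point above is worth seeing in code. This is a minimal single-process sketch (gloo backend, `world_size=1`, so it runs on CPU without a launcher): DDP replicates the model and all-reduces gradients during `backward()`, but it does not split the input batch — each rank must load its own shard of the data.

```python
import os
import tempfile

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def ddp_forward_backward():
    # A file store lets us initialize a process group without torchrun.
    init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file}",
        rank=0,
        world_size=1,
    )
    model = DDP(torch.nn.Linear(8, 2))  # gradient-sync hooks registered here
    x = torch.randn(4, 8)               # this rank's batch; DDP will not split it
    model(x).sum().backward()           # gradient all-reduce fires during backward
    dist.destroy_process_group()
    return model.module.weight.grad.shape

print(ddp_forward_backward())  # torch.Size([2, 8])
```

At a real `world_size > 1`, the only change is that each rank's `x` comes from its own data shard — typically via a `DistributedSampler`.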
| Topic | Why it matters |
| --- | --- |
| `torchrun` | One-process-per-GPU launch model and current `--local-rank` behavior. |
| NCCL environment variables | Useful when discussing operational debugging and communication failures. |
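A hedged sketch of the `torchrun` worker contract: `torchrun` launches one process per GPU and exports rank information as environment variables. Current guidance is to read `LOCAL_RANK` from the environment rather than parsing the legacy `--local-rank` argument.

```python
import os

def ranks_from_env():
    # torchrun sets LOCAL_RANK, RANK, and WORLD_SIZE for every worker process;
    # the defaults here make the script also runnable as a plain single process.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return local_rank, rank, world_size

os.environ.update(LOCAL_RANK="1", RANK="5", WORLD_SIZE="8")  # simulate torchrun
print(ranks_from_env())  # (1, 5, 8)
```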
| Topic | Why it matters |
| --- | --- |
| `torch.utils.data` | `DataLoader`, `DistributedSampler`, `prefetch_factor`, `persistent_workers`, and `set_epoch`. |
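A sketch of per-rank sharding with `DistributedSampler`. Passing `num_replicas` and `rank` explicitly keeps the example runnable without an initialized process group; under `torchrun` you would normally omit them and let the sampler read the process group. `set_epoch` reseeds the shuffle so each epoch uses a fresh order.

```python
import torch
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dataset = TensorDataset(torch.arange(8).float())
# This simulates rank 0 of a 2-rank job: the sampler hands this rank 4 of 8 samples.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)
loader = DataLoader(dataset, batch_size=2, sampler=sampler)

for epoch in range(2):
    sampler.set_epoch(epoch)  # without this, every epoch replays the epoch-0 order
    for (batch,) in loader:
        assert batch.shape[0] == 2
```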
| Topic | Why it matters |
| --- | --- |
| Distributed Checkpoint | Multi-rank checkpointing and load-time resharding. |
| Distributed Checkpoint Recipe | Practical implementation guidance. |
| Topic | Why it matters |
| --- | --- |
| AMP | Current mixed-precision guidance. |
| Activation checkpointing | Memory/compute tradeoff and RNG-state implications. |
| `torch.compile` end-to-end tutorial | Useful if you want one more modern optimization angle in the discussion. |
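A hedged CPU sketch of autocast mixed precision: ops inside the autocast region run in lower precision where safe, while parameters and gradients stay in float32. bfloat16 on CPU keeps the example runnable without a GPU; on CUDA you would pair `torch.autocast("cuda")` with a `GradScaler` when using float16.

```python
import torch

model = torch.nn.Linear(16, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(2, 16)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)            # the matmul runs in bfloat16 inside the region
    loss = out.float().sum()  # reduce in float32 for numerical safety
loss.backward()               # gradients land in the float32 parameters
opt.step()
```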

External references for the biotech-specific sections of this site. These are not PyTorch docs, but they are the standard references for the domain.

| Topic | Why it matters |
| --- | --- |
| ESM protein language models | Reference implementation for large-scale protein sequence pretraining. |
| PyTorch Geometric DataLoader | Graph-level batching for molecular GNNs; the `batch` vector semantics. |
| AlphaFold 2 technical report | Source for pair representation design and Evoformer memory characteristics. |
| Megatron-LM sequence parallelism | The communication pattern behind sequence-parallel attention. |
| EGNN equivariant network | Practical SE(3)-equivariant message passing for molecular graphs. |
| MoleculeNet splits | Scaffold, random, and scaffold-stratified split definitions used in benchmarking. |
| BEDROC enrichment metric | Why EF@1% and BEDROC matter more than AUROC for virtual screening evaluation. |
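To make the last row concrete, here is an illustrative (made-up data) sketch of the enrichment factor EF@x%: the active rate in the top-scoring x% of a ranked screen divided by the active rate in the whole library. Unlike AUROC, it only rewards actives ranked at the very top, which is what matters when you can assay only a tiny fraction of hits.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """EF@fraction; labels: 1 = active, 0 = inactive; higher score = more active."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    actives_top = sum(label for _, label in ranked[:n_top])
    return (actives_top / n_top) / (sum(labels) / len(labels))

# 100 compounds, 10 actives, and a scorer that ranks every active first:
scores = [1.0] * 10 + [0.0] * 90
labels = [1] * 10 + [0] * 90
print(enrichment_factor(scores, labels))  # 10.0, the maximum at a 10% hit rate
```

BEDROC generalizes this idea with an exponential weight over ranks instead of a hard top-x% cutoff, so early recognition is rewarded smoothly.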