Hugging Face Transformers' 72,029-line llms.txt shows what thorough AI preparation looks like

72,029 lines (+4998% vs avg)

3207 sections (+13263% vs avg)

742+ companies using llms.txt


Key Insights

Comprehensive structure

With 3207 distinct sections, this file provides thorough coverage for AI systems.

Comprehensive detail

72,029 lines of thorough documentation for AI systems.

llms.txt Preview

First 100 lines of 72,029 total


# Hyperparameter Search using Trainer API

🤗 Transformers provides a `Trainer` class optimized for training 🤗 Transformers models, making it easier to start training without manually writing your own training loop. The `Trainer` also provides an API for hyperparameter search. This doc shows how to enable it with an example.

## Hyperparameter Search backend

`Trainer` currently supports four hyperparameter search backends:
[optuna](https://optuna.org/), [sigopt](https://sigopt.com/), [raytune](https://docs.ray.io/en/latest/tune/index.html) and [wandb](https://wandb.ai/site/sweeps).

Install the backend you plan to use before selecting it as the hyperparameter search backend:
```bash
# Installs all four backends; keep only the packages you need.
pip install optuna sigopt wandb "ray[tune]"
```
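
Before launching a search, it can help to check which backends are importable in the current environment. The following is a minimal sketch using only the standard library. The `BACKEND_MODULES` mapping and the `available_backends` helper are hypothetical names introduced here, and the import names are assumed to match the pip packages above (with `ray[tune]` imported as `ray`).

```python
import importlib.util

# Assumed top-level import name for each backend (hypothetical mapping).
BACKEND_MODULES = {
    "optuna": "optuna",
    "sigopt": "sigopt",
    "raytune": "ray",
    "wandb": "wandb",
}


def available_backends():
    """Map each backend name to True if its top-level module can be found."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in BACKEND_MODULES.items()
    }


print(available_backends())
```

Any backend reported as `False` needs to be installed before it is passed as `backend=...`.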

## How to enable hyperparameter search in an example

Define the hyperparameter search space. Each backend expects a different format.

For sigopt, see the sigopt [object_parameter](https://docs.sigopt.com/ai-module-api-references/api_reference/objects/object_parameter) reference. The search space looks like this:
```py
>>> def sigopt_hp_space(trial):
...     return [
...         {"bounds": {"min": 1e-6, "max": 1e-4}, "name": "learning_rate", "type": "double"},
...         {
...             "categorical_values": ["16", "32", "64", "128"],
...             "name": "per_device_train_batch_size",
...             "type": "categorical",
...         },
...     ]
```

For optuna, see the optuna [configurations tutorial](https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/002_configurations.html#sphx-glr-tutorial-10-key-features-002-configurations-py). The search space looks like this:

```py
>>> def optuna_hp_space(trial):
...     return {
...         "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
...         "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [16, 32, 64, 128]),
...     }
```
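
To see what such a function returns without installing optuna, the search space can be exercised with a stand-in trial object. This is only a sketch: `FakeTrial` is a hypothetical class written for illustration, not part of optuna, and it mimics just the two `suggest_*` methods used above.

```python
import math
import random


class FakeTrial:
    """Hypothetical stand-in for an optuna trial (illustration only)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def suggest_float(self, name, low, high, log=False):
        # With log=True, sample uniformly in log space, as a log-scaled search does.
        if log:
            return math.exp(self.rng.uniform(math.log(low), math.log(high)))
        return self.rng.uniform(low, high)

    def suggest_categorical(self, name, choices):
        return self.rng.choice(choices)


def optuna_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [16, 32, 64, 128]
        ),
    }


params = optuna_hp_space(FakeTrial(seed=0))
print(params)
```

Each call yields one candidate configuration: a dict whose keys are the training argument names to override for that trial.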

Optuna also supports multi-objective HPO. Pass a list of directions as `direction` to `hyperparameter_search` and define your own `compute_objective` that returns one value per objective. `hyperparameter_search` then returns the Pareto front (`List[BestRun]`). For a complete example, refer to the test case `TrainerHyperParameterMultiObjectOptunaIntegrationTest` in [test_trainer](https://github.com/huggingface/transformers/blob/main/tests/trainer/test_trainer.py). The call looks like this:

```py
>>> best_trials = trainer.hyperparameter_search(
...     direction=["minimize", "maximize"],
...     backend="optuna",
...     hp_space=optuna_hp_space,
...     n_trials=20,
...     compute_objective=compute_objective,
... )
```
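
The `compute_objective` passed above is not shown in this preview. A minimal sketch of what it could look like follows, assuming the evaluation metrics dict contains `eval_loss` and `eval_accuracy`. Those key names are assumptions that depend on your `compute_metrics` function.

```python
def compute_objective(metrics):
    """Return one value per objective from the evaluation metrics dict.

    The order must match direction=["minimize", "maximize"]:
    loss is minimized, accuracy is maximized.
    The metric key names here are assumptions, not fixed by the API.
    """
    return [metrics["eval_loss"], metrics["eval_accuracy"]]


# Example with made-up metric values, purely to show the shape of the result.
example_metrics = {"eval_loss": 0.5, "eval_accuracy": 0.9}
print(compute_objective(example_metrics))
```

Keeping the objective order aligned with the `direction` list matters, because each returned value is paired positionally with its direction.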

For raytune, see the raytune [search space API](https://docs.ray.io/en/latest/tune/api/search_space.html). The search space looks like this:

```py
>>> from ray import tune

>>> def ray_hp_space(trial):
...     return {
...         "learning_rate": tune.loguniform(1e-6, 1e-4),
...         "per_device_train_batch_size": tune.choice([16, 32, 64, 128]),
...     }
```

For wandb, see the wandb [sweep configuration guide](https://docs.wandb.ai/guides/sweeps/configuration). The search space looks like this:

```py
>>> def wandb_hp_space(trial):
...     return {
...         "method": "random",
...         "metric": {"name": "objective", "goal": "minimize"},
...         "parameters": {
...             "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
...             "per_device_train_batch_size": {"values": [16, 32, 64, 128]},
...         },
...     }
```
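
Conceptually, a `"random"` sweep draws each parameter independently from its declared distribution or value list. The sketch below illustrates that idea in plain Python. `sample_once` is a hypothetical helper written for this example and is not how wandb implements sweeps internally.

```python
import random


def wandb_hp_space(trial):
    return {
        "method": "random",
        "metric": {"name": "objective", "goal": "minimize"},
        "parameters": {
            "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
            "per_device_train_batch_size": {"values": [16, 32, 64, 128]},
        },
    }


def sample_once(sweep_config, rng):
    """Hypothetical helper: draw one configuration as a random search might."""
    sampled = {}
    for name, spec in sweep_config["parameters"].items():
        if "values" in spec:
            # Discrete choice among the listed values.
            sampled[name] = rng.choice(spec["values"])
        elif spec.get("distribution") == "uniform":
            # Continuous uniform draw between min and max.
            sampled[name] = rng.uniform(spec["min"], spec["max"])
        else:
            raise ValueError(f"Unsupported parameter spec for {name!r}")
    return sampled


config = sample_once(wandb_hp_space(None), random.Random(0))
print(config)
```

Note that this configuration samples the learning rate uniformly, whereas the optuna and raytune examples sample it on a log scale.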

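Define a `model_init` function and pass it to the `Trainer` so that each trial starts from a freshly initialized model. In this example, `model_args` and `config` are assumed to be defined elsewhere in the script:
```py
>>> from transformers import AutoModelForSequenceClassification

>>> def model_init(trial):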
...     return AutoModelForSequenceClassification.from_pretrained(
...         model_args.model_name_or_path,
...         from_tf=bool(".ckpt" in model_args.model_name_or_path),
...         config=config,
...         cache_dir=model_args.cache_dir,
...         revision=model_args.model_revision,
...         token=True if model_args.use_auth_token else None,
...     )
```

Create a `Trainer` with your `model_init` function, training arguments, training and test datasets, and evaluation function:

```py
>>> trainer = Trainer(
...     model=None,
...     args=training_args,
...     train_dataset=small_train_dataset,
...     eval_dataset=small_eval_dataset,
...     compute_metrics=compute_metrics,
...     processing_class=tokenizer,
...     model_init=model_init,
... )
```
