Large Scale Training in Vertex AI
Compared with the TF Serving workflow, the major difference in large-scale training on Vertex AI is the model directory. Instead of saving the model to a local path, save it to the directory that Vertex AI provides through the AIP_MODEL_DIR environment variable.
For the specific code, see ‘Go for Codes’.
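As an illustration, here is a minimal sketch of that pattern, assuming a TensorFlow Keras model; the model itself and the local fallback path are placeholders, not the code from ‘Go for Codes’.

```python
import os

import tensorflow as tf

# Vertex AI custom training exposes the output location through the
# AIP_MODEL_DIR environment variable (a Cloud Storage path). Saving the
# model there, instead of to a local directory, is the key change.
model_dir = os.environ.get("AIP_MODEL_DIR", "saved_model_local")  # local fallback is a placeholder

# Placeholder model; substitute the actual model and training code.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# tf.keras can write the SavedModel directly to the GCS path Vertex AI provided.
model.save(model_dir)
```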
Hyperparameter Tuning in Vertex AI
To search for the best hyperparameter combination in Vertex AI, follow these steps.
- Use the `argparse` library to receive each hyperparameter value as a command-line argument.
- Use MirroredStrategy so that each trial uses all GPUs of a multi-GPU machine.
- Use the `hypertune` library to report the model’s performance to the Vertex AI service, which uses it to select the next combinations (a training-script sketch covering these three steps follows below).
- Define the machine type (see the job-configuration sketch below).
- Tune the hyperparameters.
- Get the results. The result of each trial is represented as a protocol buffer that includes the hyperparameter values and the reported metric.
For the specific code, see ‘Go for Codes’.
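To make the training-side steps concrete, here is a minimal training-script sketch; the MNIST model, metric tag, and default values are illustrative assumptions, not the code from ‘Go for Codes’. It combines `argparse` for the hyperparameter arguments, `tf.distribute.MirroredStrategy` for multi-GPU trials, and `hypertune` (the `cloudml-hypertune` package) for reporting the metric.

```python
import argparse

import hypertune
import tensorflow as tf


def parse_args():
    # Hyperparameters arrive as command-line flags, one per tuned parameter.
    parser = argparse.ArgumentParser()
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--epochs", type=int, default=5)
    return parser.parse_args()


def main():
    args = parse_args()

    # MirroredStrategy replicates the model across all GPUs on the machine,
    # so each trial makes full use of a multi-GPU worker.
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(10, activation="softmax"),
        ])
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=args.learning_rate),
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"],
        )

    (x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()
    x_train, x_val = x_train / 255.0, x_val / 255.0

    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        batch_size=args.batch_size,
        epochs=args.epochs,
    )

    # Report this trial's metric back to the Vertex AI service, which uses it
    # to select the next hyperparameter combinations.
    ht = hypertune.HyperTune()
    ht.report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag="accuracy",
        metric_value=history.history["val_accuracy"][-1],
        global_step=args.epochs,
    )


if __name__ == "__main__":
    main()
```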
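On the job-configuration side, the machine type, the search space, and the metric are defined when the hyperparameter tuning job is created. The sketch below uses the google-cloud-aiplatform SDK; the project, bucket, container image, parameter ranges, and trial counts are placeholder assumptions. After the job runs, each trial comes back as a protocol buffer message holding the sampled hyperparameters and the final measurement.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Placeholder project, region, and staging bucket.
aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

# Machine type for each trial: one worker with two GPUs, matching the
# MirroredStrategy used in the training script. The container image is a placeholder.
worker_pool_specs = [{
    "machine_spec": {
        "machine_type": "n1-standard-8",
        "accelerator_type": "NVIDIA_TESLA_T4",
        "accelerator_count": 2,
    },
    "replica_count": 1,
    "container_spec": {"image_uri": "gcr.io/my-project/trainer:latest"},
}]

custom_job = aiplatform.CustomJob(
    display_name="hp-tuning-base-job",
    worker_pool_specs=worker_pool_specs,
)

# Metric to optimize and the hyperparameter search space (illustrative ranges).
hp_job = aiplatform.HyperparameterTuningJob(
    display_name="hp-tuning-job",
    custom_job=custom_job,
    metric_spec={"accuracy": "maximize"},
    parameter_spec={
        "learning_rate": hpt.DoubleParameterSpec(min=1e-4, max=1e-1, scale="log"),
        "batch_size": hpt.DiscreteParameterSpec(values=[32, 64, 128], scale=None),
    },
    max_trial_count=20,
    parallel_trial_count=4,
)

hp_job.run()

# Each completed trial is a Trial protocol buffer message containing the
# sampled hyperparameter values and the final reported metric.
for trial in hp_job.trials:
    print(trial.id, trial.parameters, trial.final_measurement)
```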
All images, except those with separate source indications, are excerpted from lecture materials.