Instructional Guide for Model Configuration
This guide shows you how to configure model settings in the conf/model.yaml file. We offer state-of-the-art embedding models hosted on SageMaker and Bedrock, including BGE, SPLADE, and Cohere models. The guide is divided into three sections: Getting Started, Parameter Definitions, and Examples.
Getting Started
We offer four models for generating semantic embeddings, each with specific settings that you need to configure. Follow these steps to complete your YAML configuration file:
Step 1: Configure the bge-large-en-v1.5 Model
Set the initial_instance_count and instance_type fields for the bge_large_en_v1_5 model.
bge_large_en_v1_5:
  initial_instance_count: 1 # Set the initial number of instances (adjust based on your needs)
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type (e.g., "ml.g4dn.xlarge")
Step 2: Configure the splade-v3 Model
Set the initial_instance_count and instance_type fields for the splade_v3 model.
splade_v3:
  initial_instance_count: 1 # Set the initial number of instances (adjust based on your needs)
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type (e.g., "ml.g4dn.xlarge")
Note
To use splade_v3, you need to log in to your Hugging Face account.
Step 3: Configure the cohere.embed-english-v3 Model
Set the query_concurrency field for the cohere_embed_english_v3 model.
cohere_embed_english_v3:
  query_concurrency: 4 # Set the concurrency for queries (adjust based on your needs)
Note
To use cohere_embed_english_v3, you need to gain access to the model. For detailed instructions, refer to Manage access to Amazon Bedrock foundation models.
Step 4: Configure the cohere.embed-multilingual-v3 Model
Set the query_concurrency field for the cohere_embed_multilingual_v3 model.
cohere_embed_multilingual_v3:
  query_concurrency: 4 # Set the concurrency for queries (adjust based on your needs)
Note
To use cohere_embed_multilingual_v3, you need to gain access to the model. For detailed instructions, refer to Manage access to Amazon Bedrock foundation models.
Note that if you need only a subset of the models listed above, you can comment out the sections for the models you do not need by prefixing each line with #. This lets you configure only the models you intend to use. Here is an example:
## Configuration for bge-large-en-v1.5 model
bge_large_en_v1_5:
  initial_instance_count: 1 # Set the initial number of instances (adjust based on your needs)
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type (e.g., "ml.g4dn.xlarge")
## Configuration for splade-v3 model
# splade_v3:
#   initial_instance_count: 1 # Set the initial number of instances (adjust based on your needs)
#   instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type (e.g., "ml.g4dn.xlarge")
## Configuration for cohere.embed-english-v3 model
# cohere_embed_english_v3:
#   query_concurrency: 4 # Set the concurrency for queries (adjust based on your needs)
## Configuration for cohere.embed-multilingual-v3 model
# cohere_embed_multilingual_v3:
#   query_concurrency: 4 # Set the concurrency for queries (adjust based on your needs)
In the example above, all models are commented out except bge_large_en_v1_5.
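The commenting-out step can also be scripted. The following is a minimal sketch, not part of the project's tooling: the helper name comment_out_section is hypothetical, and it assumes each model section starts at the beginning of a line with its settings indented beneath it.

```python
def comment_out_section(yaml_text: str, section: str) -> str:
    """Prefix a top-level section and its indented settings with '# '.

    Hypothetical helper: works on plain text, so it assumes the section
    header starts at column 0 and its settings are indented below it.
    """
    out = []
    in_section = False
    for line in yaml_text.splitlines():
        if line.startswith(section + ":"):
            in_section = True   # header line of the target section
        elif in_section and not line.startswith((" ", "\t")):
            in_section = False  # the next unindented line ends the section
        out.append("# " + line if in_section else line)
    return "\n".join(out)


sample = """\
bge_large_en_v1_5:
  initial_instance_count: 1
  instance_type: "ml.g4dn.xlarge"
splade_v3:
  initial_instance_count: 1
  instance_type: "ml.g4dn.xlarge"
"""

result = comment_out_section(sample, "splade_v3")
print(result)
```

Because the helper edits text rather than parsing YAML, existing comments and the order of the sections are left untouched.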
Before commenting out any models, review the conf/hybrid_search.yaml file to confirm which models are being used. Avoid commenting out the models you need.
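One way to guard against disabling a model that is still in use is a quick check like the sketch below. The function names are hypothetical, and because the layout of conf/hybrid_search.yaml is not described in this guide, the set of required model names is assumed to be collected by hand and passed in.

```python
import re


def enabled_models(model_yaml_text: str) -> set:
    """Return the top-level keys that are not commented out."""
    # A key counts as enabled only if it starts at column 0,
    # so commented ("# key:") and indented lines are skipped.
    pattern = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):", re.MULTILINE)
    return set(pattern.findall(model_yaml_text))


def missing_models(model_yaml_text: str, required: set) -> set:
    """Return required model names that are not enabled in the YAML text."""
    return set(required) - enabled_models(model_yaml_text)


model_yaml = """\
bge_large_en_v1_5:
  initial_instance_count: 1
  instance_type: "ml.g4dn.xlarge"
# splade_v3:
#   initial_instance_count: 1
#   instance_type: "ml.g4dn.xlarge"
"""

# Assumption: these names were read manually from conf/hybrid_search.yaml.
required = {"bge_large_en_v1_5", "splade_v3"}
missing = missing_models(model_yaml, required)
print(sorted(missing))
```

A non-empty result means at least one model that is still needed has been commented out in conf/model.yaml.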
Step 5: Save Configuration
Save your YAML file after filling in the necessary information.
Parameter Definitions
This section provides a detailed explanation of each parameter in the YAML file.
Field name | Field Type | Required | Description |
---|---|---|---|
bge_large_en_v1_5.initial_instance_count | Int | Optional | The initial number of instances to launch for the bge_large_en_v1_5 model. |
bge_large_en_v1_5.instance_type | String | Optional | The type of instance to use for the bge_large_en_v1_5 model (e.g., ml.g4dn.xlarge). |
splade_v3.initial_instance_count | Int | Optional | The initial number of instances to launch for the splade_v3 model. |
splade_v3.instance_type | String | Optional | The type of instance to use for the splade_v3 model (e.g., ml.g4dn.xlarge). |
cohere_embed_english_v3.query_concurrency | Int | Optional | The concurrency for queries for the cohere_embed_english_v3 model. |
cohere_embed_multilingual_v3.query_concurrency | Int | Optional | The concurrency for queries for the cohere_embed_multilingual_v3 model. |
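The field types in the table can be checked before deployment. The sketch below is illustrative only: validate_model_config and EXPECTED_FIELDS are hypothetical names, the configuration is assumed to be already parsed into a dictionary (for example with PyYAML's yaml.safe_load), and treating unlisted models or fields as errors is an assumption rather than documented behavior.

```python
# Expected type of each setting, taken from the parameter table.
EXPECTED_FIELDS = {
    "bge_large_en_v1_5": {"initial_instance_count": int, "instance_type": str},
    "splade_v3": {"initial_instance_count": int, "instance_type": str},
    "cohere_embed_english_v3": {"query_concurrency": int},
    "cohere_embed_multilingual_v3": {"query_concurrency": int},
}


def validate_model_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues."""
    errors = []
    for model, settings in config.items():
        if model not in EXPECTED_FIELDS:
            # Assumption: only the four models in this guide are valid.
            errors.append(f"unknown model: {model}")
            continue
        for field, value in (settings or {}).items():
            expected = EXPECTED_FIELDS[model].get(field)
            if expected is None:
                errors.append(f"{model}.{field}: unknown field")
            # bool is a subclass of int in Python, so reject it explicitly.
            elif isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"{model}.{field}: expected {expected.__name__}")
    return errors


config = {
    "bge_large_en_v1_5": {
        "initial_instance_count": "1",  # wrong: quoted, so it is a string
        "instance_type": "ml.g4dn.xlarge",
    },
    "cohere_embed_english_v3": {"query_concurrency": 4},
}

errors = validate_model_config(config)
print(errors)
```

Since every field is optional, missing fields and omitted models are not reported; only wrong types and unrecognized names are.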
Examples
Here are some examples to help you understand how to configure the YAML file.
Example 1: Basic Configuration
## Configuration for bge-large-en-v1.5 model
bge_large_en_v1_5:
  initial_instance_count: 1 # Set the initial number of instances
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type
## Configuration for splade-v3 model
splade_v3:
  initial_instance_count: 1 # Set the initial number of instances
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type
## Configuration for cohere.embed-english-v3 model
cohere_embed_english_v3:
  query_concurrency: 4 # Set the concurrency for queries
## Configuration for cohere.embed-multilingual-v3 model
cohere_embed_multilingual_v3:
  query_concurrency: 4 # Set the concurrency for queries
Example 2: Advanced Configuration with Custom Values
## Configuration for bge-large-en-v1.5 model
bge_large_en_v1_5:
  initial_instance_count: 2 # Set the initial number of instances
  instance_type: "ml.p3.2xlarge" # Choose a more powerful instance type
## Configuration for splade-v3 model
splade_v3:
  initial_instance_count: 3 # Set the initial number of instances
  instance_type: "ml.g4dn.12xlarge" # Choose a larger instance type
## Configuration for cohere.embed-english-v3 model
cohere_embed_english_v3:
  query_concurrency: 8 # Increase the concurrency for queries
## Configuration for cohere.embed-multilingual-v3 model
cohere_embed_multilingual_v3:
  query_concurrency: 6 # Set a custom concurrency for queries
Example 3: Using splade_v3 Only
## Configuration for bge-large-en-v1.5 model
# bge_large_en_v1_5:
#   initial_instance_count: 1 # Set the initial number of instances
#   instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type
## Configuration for splade-v3 model
splade_v3:
  initial_instance_count: 1 # Set the initial number of instances
  instance_type: "ml.g4dn.xlarge" # Choose the appropriate instance type
## Configuration for cohere.embed-english-v3 model
# cohere_embed_english_v3:
#   query_concurrency: 4 # Set the concurrency for queries
## Configuration for cohere.embed-multilingual-v3 model
# cohere_embed_multilingual_v3:
#   query_concurrency: 4 # Set the concurrency for queries