Home

Overview

Search HQ is a search management platform built on OpenSearch and AWS services. It provides an OpenSearch API Gateway, Auto Scaling for the OpenSearch cluster, and Hybrid Search, which combines keyword and semantic search to improve relevance.

The key features are described below.

Key Features

1. OpenSearch API Gateway

The OpenSearch API Gateway exposes search-related operations through a set of endpoints and integrates with other AWS resources, extending functionality beyond what OpenSearch provides on its own.

Endpoints:

/bulk: Supports bulk operations, uploading or modifying multiple documents in a single request, and is optimized for large volumes of data. For Hybrid Search, it performs embedding inference automatically during data ingestion.
/search: Handles standard search queries.
/hybrid-search: Combines keyword and semantic search to improve result relevance.
/raw-backend: Passes requests directly to the native OpenSearch REST API, allowing CRUD operations on OpenSearch indices without any custom optimizations. This endpoint preserves the original behavior of OpenSearch.
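
As an illustration of what a bulk request body can look like, the sketch below builds a payload in the newline-delimited JSON format used by the native OpenSearch Bulk API: one action line followed by one document line per record. It assumes that /raw-backend forwards such a body unchanged; the function name `build_bulk_body`, the index name, and the sample documents are hypothetical, and the body format expected by /bulk itself is not specified here.

```shell
# Build an OpenSearch-style bulk body (NDJSON).
# Arguments: index name, then alternating document ID and JSON document.
# Each record becomes an "index" action line followed by the document line.
build_bulk_body() {
  local index="$1"
  shift
  while [ "$#" -ge 2 ]; do
    printf '{"index":{"_index":"%s","_id":"%s"}}\n' "$index" "$1"
    printf '%s\n' "$2"
    shift 2
  done
}

# Hypothetical sample data; the index name and fields are placeholders.
build_bulk_body products \
  1 '{"title":"red running shoes"}' \
  2 '{"title":"blue trail sneakers"}'

# Sending it is left as a comment because the URL, path, and authentication
# depend on your deployment (GATEWAY_URL and the /_bulk suffix are assumptions):
#   build_bulk_body products 1 '{"title":"red running shoes"}' |
#     curl -s -X POST "$GATEWAY_URL/raw-backend/_bulk" \
#       -H 'Content-Type: application/x-ndjson' --data-binary @-
```

The body must end with a newline, which the final `printf` guarantees; `--data-binary` is used so that curl does not strip the line breaks.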

Integration with AWS Services:

The API Gateway, together with AWS Lambda, interacts with the following AWS resources to extend its capabilities beyond OpenSearch:

Amazon SageMaker / Amazon Bedrock: Runs embedding inference, which lets search take the context and semantics of queries into account.
Amazon Personalize: Delivers personalized search experiences.
Amazon S3: Stores logs and other data.
Amazon Data Firehose: Streams data to destinations such as Amazon S3.

Together, these integrations extend OpenSearch into a more scalable and versatile search solution.

2. OpenSearch Auto Scaling

OpenSearch Auto Scaling lets the OpenSearch cluster handle varying loads by automatically adjusting its resources based on CPU utilization rather than disk usage alone.

Components:

CloudWatch Alarms: Monitor critical metrics such as CPU usage, which is the primary indicator for scaling decisions. Additional alarms can monitor storage capacity and other performance indicators.
EventBridge Rules: Trigger actions in response to the alarms. For instance, a rule might invoke a Lambda function when CPU usage exceeds a certain threshold.
Lambda Functions: Execute the scaling operations, adding or removing nodes based on the current demand.
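
To make the alarm-to-rule link concrete, the following sketch prints an EventBridge event pattern that matches a CloudWatch alarm entering the ALARM state. The pattern fields follow the standard CloudWatch alarm state change event; the function name `alarm_event_pattern` and the alarm and rule names are hypothetical, and the rules that Search HQ actually creates may differ.

```shell
# Print an EventBridge event pattern that matches one CloudWatch alarm
# when it transitions into the ALARM state.
alarm_event_pattern() {
  local alarm_name="$1"
  cat <<EOF
{
  "source": ["aws.cloudwatch"],
  "detail-type": ["CloudWatch Alarm State Change"],
  "detail": {
    "alarmName": ["$alarm_name"],
    "state": { "value": ["ALARM"] }
  }
}
EOF
}

# Hypothetical alarm name.
alarm_event_pattern "search-hq-cpu-high"

# With AWS credentials configured, the pattern could be registered like this
# (the rule name is a placeholder):
#   aws events put-rule --name search-hq-scale-out \
#     --event-pattern "$(alarm_event_pattern search-hq-cpu-high)"
```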

Scaling Criteria:

CPU utilization is the primary criterion, which keeps the cluster responsive under varying loads.
Storage usage and latency can also be taken into account.

Scaling on real-time CPU metrics, rather than on disk usage alone, helps maintain both performance and cost-efficiency.
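
The CPU-based decision can be sketched as a small function that compares average CPU utilization against two thresholds. This is only an illustration of the idea: the function name `decide_scaling`, the thresholds (75% and 30%), and the node limits (2 to 10) are hypothetical defaults, not values taken from Search HQ.

```shell
# Decide a scaling action from average CPU utilization (integer percent)
# and the current node count. Optional arguments override the hypothetical
# defaults: scale-out threshold, scale-in threshold, min nodes, max nodes.
decide_scaling() {
  local cpu="$1" nodes="$2"
  local scale_out_at="${3:-75}" scale_in_at="${4:-30}"
  local min_nodes="${5:-2}" max_nodes="${6:-10}"

  if [ "$cpu" -ge "$scale_out_at" ] && [ "$nodes" -lt "$max_nodes" ]; then
    echo "scale_out $((nodes + 1))"
  elif [ "$cpu" -le "$scale_in_at" ] && [ "$nodes" -gt "$min_nodes" ]; then
    echo "scale_in $((nodes - 1))"
  else
    echo "hold $nodes"
  fi
}

decide_scaling 82 3   # prints "scale_out 4"
decide_scaling 20 3   # prints "scale_in 2"
decide_scaling 50 3   # prints "hold 3"
```

Keeping a gap between the two thresholds avoids adding and removing nodes in quick succession when CPU utilization hovers around a single cutoff.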

3. Hybrid Search

Hybrid Search combines traditional keyword-based search with semantic search to deliver more relevant results, and adds several optimizations over the native OpenSearch capabilities.

Optimizations:

Parallel Inference: Computes document and field embeddings in parallel, which speeds up indexing.
State-of-the-Art Models: Pre-packaged state-of-the-art (SOTA) models are integrated, simplifying deployment and usage.
Fusion Techniques: Offers several methods for combining keyword and semantic scores into a single ranking.

Hybrid Search is particularly useful when users describe the same concept with different terminology, a case where keyword matching alone can miss relevant documents. It also offers better performance and ease of use than native OpenSearch.
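
As an example of what a fusion technique does, the sketch below implements reciprocal rank fusion (RRF), one widely used way to merge a keyword ranking and a semantic ranking. It is not necessarily one of the methods Search HQ ships; the function name `rrf_fuse`, the document IDs, and the constant k = 60 (a conventional default for RRF) are assumptions made for illustration.

```shell
# Fuse two ranked lists of document IDs with reciprocal rank fusion (RRF).
# Each list is a space-separated string of IDs, best match first.
# Every list contributes 1 / (k + rank) to a document's fused score.
rrf_fuse() {
  local keyword_ranked="$1" semantic_ranked="$2" k="${3:-60}"
  {
    # Emit "<id> <rank>" pairs; each awk process keeps its own rank counter.
    printf '%s\n' "$keyword_ranked"  | tr -s ' ' '\n' | awk 'NF { print $1, ++r }'
    printf '%s\n' "$semantic_ranked" | tr -s ' ' '\n' | awk 'NF { print $1, ++r }'
  } | LC_ALL=C awk -v k="$k" '
      { score[$1] += 1 / (k + $2) }
      END { for (id in score) printf "%s %.5f\n", id, score[id] }
    ' | LC_ALL=C sort -k2,2nr -k1,1
}

# Hypothetical rankings for one query: keyword results, then semantic results.
# Output is one "<id> <score>" line per document, highest fused score first.
rrf_fuse "d1 d2 d3" "d3 d1 d4"
```

Because RRF uses only ranks, it does not need the keyword and semantic scores to be on the same scale. Documents that appear in both lists (here d1 and d3) end up ahead of documents that appear in only one.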

Setup Instructions

Follow the Search HQ Launch Pack guide to set up the instance. Once the instance is set up, you can use the search-hq CLI commands.

Launch Our Product on AWS Marketplace

Step 1: Go to AWS Marketplace

Visit our product's page on AWS Marketplace.

Step 2: Select the Product

On the product details page, click the "Continue to Subscribe" button.

Step 3: Choose EC2 Launch

On the subscription confirmation page, click the "Continue to Configuration" button.
Choose your configuration options, such as software version and region, then click the "Continue to Launch" button.
On the launch page, select "Launch through EC2".
Configure the EC2 instance as per your needs, then click the "Launch" button.

Important Considerations During Launch

Name: Assign a name to your instance, such as search-hq.
Instance Type: Select an instance type with at least 4 GB of memory, such as t4g.medium.
Key Pair: Create a new key pair or select an existing one. You will need the private key file (.pem) to connect to your instance via SSH. For more details, refer to the AWS Key Pair documentation.
VPC Configuration: Configure the VPC settings to ensure network connectivity for your instances. If you are unsure, you can use the default VPC provided by AWS. For more details, refer to the AWS VPC documentation.

Step 4: Connect to Your EC2 Instance

Open an SSH client.
Locate your private key file. The key used to launch this instance is YOUR-KEY-PAIR.pem.
Run this command, if necessary, to ensure your key is not publicly viewable.
```
chmod 400 "YOUR-KEY-PAIR.pem"
```
Connect to your instance using its Public DNS.
```
ssh -i YOUR-KEY-PAIR.pem ubuntu@EC2-INSTANCE-PUBLIC-DNS
```
For example, if your key pair file is named search-hq.pem and your instance's public DNS is ec2-43-207-37-42.ap-northeast-1.compute.amazonaws.com, the command would look like this:
```
ssh -i "search-hq.pem" ubuntu@ec2-43-207-37-42.ap-northeast-1.compute.amazonaws.com
```
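
If SSH rejects the key because its file permissions are too open, a quick check like the sketch below can confirm whether the key file is readable by the owner only. The function name `key_is_private` is hypothetical, and the check relies on GNU `stat`, as found on Ubuntu and most Linux distributions.

```shell
# Succeed only if the key file exists and grants no access to group or others
# (for example mode 400 or 600). Uses GNU stat to read the octal mode.
key_is_private() {
  local key="$1" mode
  [ -f "$key" ] || return 1
  mode="$(stat -c '%a' "$key")" || return 1
  case "$mode" in
    *00) return 0 ;;   # group and other permission digits are both 0
    *)   return 1 ;;
  esac
}

# Example use before connecting (file name and host are placeholders):
#   key_is_private YOUR-KEY-PAIR.pem || chmod 400 YOUR-KEY-PAIR.pem
#   ssh -i YOUR-KEY-PAIR.pem ubuntu@EC2-INSTANCE-PUBLIC-DNS
```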

Step 5: Configure AWS Permissions

Set Up Permissions

We recommend attaching the AdministratorAccess AWS managed policy to the IAM role associated with your EC2 instance, so that the role has the broad permissions the setup requires.

Once connected to your EC2 instance, ensure that the instance has the necessary AWS permissions. You can configure this in several ways:
- IAM Roles: Attach an appropriate IAM role to the EC2 instance with the necessary permissions. This method is recommended as it provides temporary security credentials and simplifies access key management. For more information, visit IAM Roles for Amazon EC2.
- AWS CLI Configuration: Manually configure AWS credentials using the aws configure command:
```
aws configure
```
- Environment Variables: Set AWS credentials and region using environment variables:
```
export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
export AWS_DEFAULT_REGION=your_region
```
- Configuration Files: Use the AWS credentials and config files located in ~/.aws/credentials and ~/.aws/config. For more information, visit Configuration and Credential Files.
Use the following command to check which identity the instance's credentials resolve to:
```
aws sts get-caller-identity
```
This command returns the AWS account ID, user ID, and ARN of the identity in use. If it succeeds and the ARN shows the expected IAM role, the credentials are configured correctly. It does not check which permissions that role has, because the call succeeds for any valid credentials.
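
When the instance uses an IAM role, the ARN returned by the command has the form `arn:aws:sts::ACCOUNT_ID:assumed-role/ROLE_NAME/SESSION_NAME`. The sketch below extracts the role name from such an ARN so it can be compared with the role you attached; the function name `role_from_arn`, the account ID, and the role name are hypothetical.

```shell
# Print the role name if the ARN belongs to an assumed IAM role;
# return non-zero for any other kind of identity (for example an IAM user).
role_from_arn() {
  local arn="$1" rest
  case "$arn" in
    arn:aws*:sts::*:assumed-role/*/*)
      rest="${arn#*:assumed-role/}"   # drop everything up to the role name
      printf '%s\n' "${rest%%/*}"     # drop the session name
      ;;
    *)
      return 1
      ;;
  esac
}

# Hypothetical ARN as returned for an EC2 instance role.
role_from_arn "arn:aws:sts::123456789012:assumed-role/search-hq-role/i-0123456789abcdef0"

# On the instance, the live ARN can be passed in directly:
#   role_from_arn "$(aws sts get-caller-identity --query Arn --output text)"
```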

Step 6: Using AWS Auth for OpenSearch Permission Management

If you use AWS authentication to manage permissions in OpenSearch, add the IAM role of the API router Lambda to the role mapping of an OpenSearch role that grants the required cluster and index permissions.

For more details on role mapping, you can refer to the OpenSearch Role Mapping Documentation.
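
As a rough sketch of what such a mapping looks like, the function below prints a request body for the OpenSearch Security plugin's role-mapping API, which lists IAM role ARNs under `backend_roles`. The function name `role_mapping_body`, the account ID, and the Lambda role name are hypothetical; the OpenSearch role to map, the endpoint URL, and the credentials for the request depend on your cluster.

```shell
# Print a role-mapping request body that maps the given IAM role ARNs
# as backend roles. Arguments: one or more IAM role ARNs.
role_mapping_body() {
  local first=1 arn
  printf '{"backend_roles":['
  for arn in "$@"; do
    [ "$first" -eq 1 ] || printf ','
    printf '"%s"' "$arn"
    first=0
  done
  printf ']}\n'
}

# Hypothetical ARN of the API router Lambda's execution role.
role_mapping_body "arn:aws:iam::123456789012:role/search-hq-api-router"

# The body could then be sent to the cluster (OPENSEARCH_URL, ROLE_NAME, and
# the authentication options are placeholders you must supply):
#   role_mapping_body "$LAMBDA_ROLE_ARN" |
#     curl -s -X PUT "$OPENSEARCH_URL/_plugins/_security/api/rolesmapping/$ROLE_NAME" \
#       -H 'Content-Type: application/json' --data-binary @-
```

A PUT request replaces the existing mapping for that role, so include any backend roles that are already mapped and should remain.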