Introduction
This guide walks step by step through building an event-driven architecture on AWS that automatically converts audio files uploaded to an Amazon S3 bucket into text using Amazon Transcribe. The workflow combines Amazon S3, AWS Lambda, Amazon Transcribe, and (optionally) Amazon SNS so that no manual step is needed after the upload.
Architecture Overview
The architecture consists of the following AWS services:
Amazon S3: Used to store audio files.
AWS Lambda: Triggers the transcription process when a new audio file is uploaded to S3.
Amazon Transcribe: Converts audio files to text.
Amazon SNS (Simple Notification Service): Optionally notifies users by email about transcription jobs.
Step-by-Step Implementation
Step 1: Create an S3 Bucket
Log in to your AWS Management Console.
Navigate to S3.
Click on Create bucket.
Enter a unique bucket name and choose a region.
Click Create bucket.
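The same bucket can also be created from code. The following is a minimal sketch, assuming you pass in a boto3 S3 client; the helper name create_audio_bucket and the bucket name in the usage comment are hypothetical. It accounts for the fact that us-east-1 must not be sent as a LocationConstraint.

```python
def create_audio_bucket(s3_client, bucket_name, region):
    """Create an S3 bucket in the given region.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client("s3", region_name=region).
    """
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed
    # as a LocationConstraint; every other region must be.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**params)


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("s3", region_name="eu-west-1")
#   create_audio_bucket(client, "my-audio-uploads-example", "eu-west-1")
```

Passing the client in as a parameter keeps the helper easy to try out with a stub before running it against a real account.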
Step 2: Set Up AWS Lambda Function
Go to the AWS Lambda service.
Click on Create function.
Choose Author from scratch.
Enter a name for your function, e.g., TranscribeAudioFunction.
Choose a runtime (Node.js and Python are commonly used; the code in this guide is Python).
Under Permissions, select Create a new role with basic Lambda permissions.
Click Create function.
Step 3: Configure Lambda to Handle S3 Events
Scroll down to the Function code section.
Add the following code snippet in Python to handle the S3 event and start the transcription job:
import json
import re
import urllib.parse

import boto3

transcribe = boto3.client('transcribe')


def lambda_handler(event, context):
    # Extract the bucket name and object key from the event
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    # Object keys arrive URL-encoded in S3 events (a space becomes '+')
    key = urllib.parse.unquote_plus(record['s3']['object']['key'])

    # Transcription job details
    base_name = key.rsplit('.', 1)[0]  # Use the file name without extension
    # Job names may only contain letters, digits, '.', '_' and '-', and must be unique
    job_name = re.sub(r'[^0-9a-zA-Z._-]', '_', base_name)
    job_uri = f"s3://{bucket}/{key}"

    # Start transcription job
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={'MediaFileUri': job_uri},
        MediaFormat='mp3',     # Change as per your audio file format
        LanguageCode='en-US'   # Change as per your audio language
    )

    return {
        'statusCode': 200,
        'body': json.dumps('Transcription job started successfully!')
    }
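The handler hard-codes MediaFormat, so it only works for one file type. As a rough sketch of one way around that, the helper below derives the job parameters from the S3 event, including the media format from the file extension. The function name job_params_from_event and the sample bucket and key are hypothetical; the set of formats reflects the values Amazon Transcribe accepts for MediaFormat, and the default language code is an assumption carried over from the handler.

```python
import os
import re
import urllib.parse

# Media formats accepted by the MediaFormat parameter of start_transcription_job
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "flac", "ogg", "amr", "webm", "m4a"}


def job_params_from_event(event, language_code="en-US"):
    """Build keyword arguments for start_transcription_job from an S3 event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Keys in S3 event notifications are URL-encoded (a space becomes '+')
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    base, ext = os.path.splitext(key)
    media_format = ext.lstrip(".").lower()
    if media_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {ext!r}")

    # Job names may only contain letters, digits, '.', '_' and '-'
    job_name = re.sub(r"[^0-9a-zA-Z._-]", "_", base)
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }


# A trimmed-down sample event for local experiments (hypothetical names)
SAMPLE_EVENT = {
    "Records": [
        {"s3": {"bucket": {"name": "my-audio-bucket"},
                "object": {"key": "calls/team+meeting.WAV"}}}
    ]
}
```

Inside the handler, the result could then be passed on as transcribe.start_transcription_job(**job_params_from_event(event)). Because job names must be unique per account and region, uploading the same file name twice would still need an extra suffix such as a timestamp.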
Step 4: Set Up S3 Event Notification
Go back to your S3 bucket and select the Properties tab.
Scroll down to Event notifications and click Create event notification.
Enter a name for your event.
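Under Event types, select PUT (to trigger when a file is uploaded). Large files are often uploaded in multiple parts, which does not raise a PUT event; to cover those as well, select All object create events instead.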
Under Destination, choose Lambda Function and select the Lambda function you created earlier.
Click Save changes.
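The console steps above can also be expressed in code. This is a sketch only, assuming a boto3 S3 client is passed in; the helper names and the ARN in the usage comment are hypothetical, and the suffix filter is an optional addition that is not part of the console steps.

```python
def build_notification_config(lambda_arn, suffix=".mp3"):
    """Return a notification configuration that invokes a Lambda on PUT uploads."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:Put"],
                # Optional: only react to keys ending in the given suffix
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def attach_notification(s3_client, bucket, lambda_arn, suffix=".mp3"):
    """Apply the configuration to the bucket.

    Note: this call replaces any notification configuration
    that already exists on the bucket.
    """
    return s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn, suffix),
    )


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   attach_notification(
#       boto3.client("s3"),
#       "my-audio-uploads-example",
#       "arn:aws:lambda:your-region:your-account-id:function:TranscribeAudioFunction",
#   )
```

When the notification is created in the console, S3 is granted permission to invoke the function automatically. When it is created through the API, that permission has to be added to the function separately, otherwise the call is rejected.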
Step 5: IAM Permissions for Lambda
Go to the IAM service in the AWS Management Console.
Select the role associated with your Lambda function.
Click on Attach policies and add the following policies:
AmazonS3ReadOnlyAccess
AmazonTranscribeFullAccess
AWSLambdaBasicExecutionRole
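The managed policies listed above grant read access to every bucket and every Amazon Transcribe action. If you prefer a narrower scope, a custom inline policy is one alternative. The sketch below builds such a policy document; the function name build_lambda_policy and the bucket name are hypothetical, and the chosen actions are an assumption about what this workflow needs (reading the uploaded audio, starting a job, and reading its status).

```python
import json


def build_lambda_policy(bucket_name):
    """Return an IAM policy document scoped to one bucket and two Transcribe actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access to the uploaded audio files only
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Start jobs and look up their status and result
                "Effect": "Allow",
                "Action": [
                    "transcribe:StartTranscriptionJob",
                    "transcribe:GetTranscriptionJob",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the IAM policy editor
    print(json.dumps(build_lambda_policy("my-audio-uploads-example"), indent=2))
```

AWSLambdaBasicExecutionRole would still be attached alongside this policy so that the function can write its logs.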
Step 6: (Optional) Set Up SNS for Notifications
Create a new SNS Topic in the SNS console.
Subscribe to the topic with your email to receive notifications.
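Update your Lambda function to publish a message to the SNS topic once the transcription job has been started, and allow the function's role to publish to the topic (sns:Publish). The snippet belongs inside lambda_handler, after the start_transcription_job call. Note that it reports the start of the job, not its completion: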
import boto3

sns = boto3.client('sns')

# After starting the transcription job
sns.publish(
    TopicArn='arn:aws:sns:your-region:your-account-id:your-topic-name',
    Message='Transcription job started: ' + job_name
)
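To be notified when a job actually finishes, one possible approach (not part of the steps in this guide) is a second Lambda function triggered by an Amazon EventBridge rule that matches Amazon Transcribe job state changes. The sketch below assumes such a rule exists and that the event carries the job name and status under the detail key; the helper names are hypothetical, and the SNS client and topic ARN are passed in by the caller.

```python
def build_completion_message(event):
    """Format a notification from a Transcribe job state change event."""
    detail = event["detail"]
    name = detail["TranscriptionJobName"]
    status = detail["TranscriptionJobStatus"]  # e.g. COMPLETED or FAILED
    return f"Transcription job {name} finished with status {status}"


def notify_completion(sns_client, topic_arn, event):
    """Publish the formatted message to the given SNS topic."""
    return sns_client.publish(
        TopicArn=topic_arn,
        Subject="Transcription job update",
        Message=build_completion_message(event),
    )


# Example usage inside a Lambda handler (requires boto3; ARN is a placeholder):
#   import boto3
#   def lambda_handler(event, context):
#       notify_completion(
#           boto3.client("sns"),
#           "arn:aws:sns:your-region:your-account-id:your-topic-name",
#           event,
#       )
```

Separating the message formatting from the publish call makes the formatting easy to check locally with a hand-written event.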
Step 7: Test the Architecture
Upload an audio file to your S3 bucket.
Open the Amazon Transcribe console and check under Transcription jobs that a job has started.
If you’ve set up SNS, check your email for notifications.
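Instead of refreshing the console, the job status can be polled from a script. The following is a minimal sketch, assuming a boto3 Transcribe client is passed in; the function name wait_for_job and its default intervals are hypothetical choices.

```python
import time


def wait_for_job(transcribe_client, job_name, poll_seconds=10,
                 timeout_seconds=600, sleep=time.sleep):
    """Poll a transcription job until it completes, fails, or the timeout passes."""
    waited = 0
    while True:
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        status = response["TranscriptionJob"]["TranscriptionJobStatus"]
        # QUEUED and IN_PROGRESS mean the job is still running
        if status in ("COMPLETED", "FAILED"):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Job {job_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   print(wait_for_job(boto3.client("transcribe"), "your-job-name"))
```

The sleep function is a parameter only so that the loop can be exercised quickly with a stub client; in normal use the default is fine.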
Step 8: Retrieve Transcription Results
You can retrieve the transcription results by calling the get_transcription_job method in another Lambda function, or manually in the AWS Management Console. The transcript location is only present in the response once the job status is COMPLETED. Here’s an example of how to retrieve the transcription results:
import boto3
transcribe = boto3.client('transcribe')

def get_transcription_result(job_name):
    response = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    return response['TranscriptionJob']['Transcript']['TranscriptFileUri']
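The returned URI points to a JSON file rather than to plain text. As a sketch of the remaining step, the helpers below download that file and pull out the transcript text. The function names are hypothetical, and the download helper assumes the URI can be fetched over HTTPS without extra credentials, which may not hold if the job writes its output to your own bucket.

```python
import json
import urllib.request


def extract_transcript_text(transcript_json):
    """Return the plain text from the JSON document produced by Amazon Transcribe."""
    data = json.loads(transcript_json)
    # The full text is stored under results -> transcripts -> transcript
    return " ".join(t["transcript"] for t in data["results"]["transcripts"])


def fetch_transcript_text(transcript_file_uri):
    """Download the transcript file and return its text (assumes a fetchable HTTPS URI)."""
    with urllib.request.urlopen(transcript_file_uri) as response:
        return extract_transcript_text(response.read().decode("utf-8"))


# Example usage:
#   uri = get_transcription_result("your-job-name")
#   print(fetch_transcript_text(uri))
```

Keeping the parsing separate from the download means the parsing can be checked against a saved transcript file without any network access.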
Conclusion
By following this guide, you have built an event-driven architecture that automatically starts an Amazon Transcribe job whenever an audio file is uploaded to S3. Because no manual step is needed after the upload, the setup is well suited to applications such as transcription services, content creation, and accessibility solutions.