AWS Transcription: User Guide

Overview

SSD Scribe is a transcription service offered by the Social Sciences Division. It leverages Amazon Web Services (AWS) for secure storage and automated transcription. The service is designed to quickly produce initial draft transcripts, which can then be refined and analyzed using other University of Chicago resources.

See the SSCS Researcher Transcription Service webpage for more information.

Supported Features

Before Beginning: Review Supported Features

Supported Languages
See “Supported languages and language-specific features” in the Amazon Transcribe documentation.

Supported File Formats
See the “Media formats” section of “Data input and output” in the Amazon Transcribe documentation for the list of supported formats.

Naming Conventions
This service uses the Amazon S3 storage service, and all uploaded files must follow Amazon S3 object key naming conventions. Please make sure that your file names are valid before uploading.
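
As a rough pre-upload check, the sketch below flags file names that use characters outside the set the Amazon S3 documentation describes as safe for object keys (letters, digits, and ! - _ . * ' ( )), or that exceed the 1,024-byte key limit. The function name check_file_name is hypothetical and not part of SSD Scribe. The check is deliberately conservative: S3 accepts other characters, and this guide does not say which ones SSD Scribe itself rejects.

```python
import string

# Characters the Amazon S3 documentation lists as safe for object keys.
SAFE_CHARACTERS = set(string.ascii_letters + string.digits + "!-_.*'()")
MAX_KEY_BYTES = 1024  # S3 limits an object key to 1,024 bytes of UTF-8


def check_file_name(name: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name.encode("utf-8")) > MAX_KEY_BYTES:
        problems.append("name is longer than 1,024 bytes")
    unsafe = sorted({ch for ch in name if ch not in SAFE_CHARACTERS})
    if unsafe:
        shown = " ".join(repr(ch) for ch in unsafe)
        problems.append("characters outside the safe set: " + shown)
    return problems


if __name__ == "__main__":
    for candidate in ["interview_01.mp3", "interview 01.mp3"]:
        print(candidate, "->", check_file_name(candidate) or "looks fine")
```

Renaming a flagged file (for example, replacing spaces with underscores) before uploading is usually the simplest fix.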

Audio File Time Limitations
Audio files longer than three hours are not supported in this version. SSCS recommends keeping files under two hours for best results.
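
If you want to check recordings before uploading, the following sketch classifies a duration against the two limits stated above. The helper names are hypothetical. The duration reader uses Python's standard wave module, so it only handles uncompressed WAV files; for other formats you would need to obtain the duration some other way and pass it to classify_duration directly.

```python
import wave

MAX_SECONDS = 3 * 60 * 60          # longer than three hours: not supported
RECOMMENDED_SECONDS = 2 * 60 * 60  # SSCS recommends staying under two hours


def wav_duration_seconds(path: str) -> float:
    """Read the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_duration(seconds: float) -> str:
    """Compare a duration against the limits stated in this guide."""
    if seconds > MAX_SECONDS:
        return "too long"
    if seconds >= RECOMMENDED_SECONDS:
        return "accepted, not recommended"
    return "ok"


if __name__ == "__main__":
    print(classify_duration(45 * 60))        # a 45-minute recording
    print(classify_duration(2.5 * 60 * 60))  # a 2.5-hour recording
```

Files classified as “too long” would need to be split into shorter segments before upload.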

Multi-Speaker Limitation
The service automatically identifies up to nine distinct speakers in the transcript. The transcription model assigns each speaker a numeric label, starting at 0.

Automatic Multi-Language Detection
SSD Scribe is designed to recognize multiple languages. It is currently unclear whether there is a limit to the number of languages that can be identified within a single file. 

Custom Dictionary
SSD Scribe allows you to upload a custom dictionary to ensure technical and domain-specific terms are accurately represented in your transcripts. If you have uncommon words that need special attention, or if you prefer certain words to be transcribed in a specific way—such as “year nineteen-twenty” instead of “1920”—using a custom dictionary is recommended. Please note that using a custom dictionary disables automatic multi-language detection, so you will need to specify the processing language.

Custom Language Model
SSD Scribe supports training a custom language model for transcription. Training data shape the transcription output by telling the model, “These items (e.g., words, phrases, pronunciation patterns, domain terms) exist and appear together often.”

You can upload text files as training data, with a minimum of 100,000 words recommended for accuracy. After uploading your files, training the language model typically takes 6–10 hours, depending on data size.

Currently supported languages include English, Spanish, German, Japanese, and Hindi.

Note: Although the interface lists PDF files as acceptable, they often cause issues during training and should be avoided.

The custom language model also has a tuning data option. Tuning data are applied after the model has learned from the training data, and they calibrate the learned patterns so the model performs better on your actual recordings. In effect, tuning data tell the model, “Here are some examples of the exact style of text we want you to match. Adjust yourself to this specific style.”

Logging Into Your SSD Scribe Account

Before you begin, ensure your computer is connected to UChicago Wi-Fi when on campus, or to the UChicago Virtual Private Network (cVPN) when off campus.

In your web browser, navigate to https://scribe.ssd.uchicago.edu/. Click “Log In,” fill in your CNET credentials, and then complete the 2FA request.

Submit an Audio or Video File for Transcription

Step 1: Once you have logged in, click the “Transcribe” button on the SSD Scribe homepage to get started.

Step 2: Upload your audio or video files

To add files, you can either (1) drag and drop files into the indicated field or (2) click “Browse,” select your audio or video files, and then click “Open.”

Step 3: Select language (Optional)

Automatic multi-language identification is the default setting. Alternatively, you can specify a single language for processing. Note that if you choose to use a custom dictionary or custom language model, you must manually specify the processing language.

Step 4: Select custom dictionary (Optional)

If you previously uploaded a custom dictionary, select it from the dropdown menu. If not, simply skip this step.

Step 5: Select custom language model (Optional)

If you have trained a custom language model, select it from the dropdown menu. If not, simply skip this step.

Step 6: Submit files for transcription

When you’re ready to submit your files for transcription, click “Upload for Transcription.”

Step 7: Wait for the system to upload your files

You can monitor the upload progress on the screen. Depending on your internet connection, the process may take a few minutes.

Step 8: Transcription job queued

Once you see the green status window labeled “Transcription in Progress,” your upload is complete. You can close the window or click ‘View All Results’ to go to the History page and monitor your transcription files as they are processed. Depending on the size of your data and current server load, transcription may take anywhere from several minutes to about an hour.

Note: Occasionally, a bug may prevent the system from queuing your transcription after the upload finishes. If this happens, refresh the page and try uploading your files again.

Download the Transcription File

Step 1: Navigate to the ‘History’ page

After logging in to SSD Scribe, click the ‘History’ tab in the top navigation menu to access your transcription history from anywhere on the site.

Step 2: Check the status of your transcription jobs

On the History page, you can view all your transcription jobs, which are stored for up to 30 days. If a job is complete, you’ll see a green notification: “Transcription completed successfully.” If it’s still processing, you’ll see a yellow notification: “Transcription in progress.”

Step 3: Download your transcription files

Once the transcription job is complete, you can download the output file in any of three formats: JSON, DOCX, or CSV. Files not downloaded will be automatically deleted after 30 days.
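
For researchers who want to work with the JSON output programmatically, here is a minimal sketch that extracts the full text and the total speaking time per speaker. It assumes the downloaded file keeps the standard Amazon Transcribe layout (results.transcripts and results.speaker_labels.segments, with times stored as strings); this guide does not confirm that SSD Scribe leaves the layout unchanged, so inspect your own file first. The function names and the file name in the usage comment are hypothetical.

```python
import json
from collections import defaultdict


def load_transcript(path: str) -> dict:
    """Load a downloaded JSON transcript from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def full_text(transcript: dict) -> str:
    """Join the transcript text (assumed under results.transcripts)."""
    entries = transcript.get("results", {}).get("transcripts", [])
    return " ".join(entry["transcript"] for entry in entries)


def speaking_time_by_speaker(transcript: dict) -> dict[str, float]:
    """Sum the seconds attributed to each speaker label."""
    totals = defaultdict(float)
    segments = (
        transcript.get("results", {})
        .get("speaker_labels", {})
        .get("segments", [])
    )
    for segment in segments:
        # Assumption: start and end times are stored as strings of seconds.
        duration = float(segment["end_time"]) - float(segment["start_time"])
        totals[segment["speaker_label"]] += duration
    return dict(totals)


# Usage (file name is an example only):
# transcript = load_transcript("interview_01.json")
# print(full_text(transcript))
# print(speaking_time_by_speaker(transcript))
```

If the keys are missing, both functions return empty results rather than raising an error, which makes it easy to notice that the file has a different layout.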

Transcription files generated from interviews or data are as sensitive as the original input files and are subject to the requirements of the IRB protocol. By using this system, you acknowledge that you will comply with all IRB requirements.

Upload a Custom Dictionary

Step 1: Navigate to the ‘Dictionaries’ page

After logging in to SSD Scribe, click the ‘Dictionaries’ button to get started.

Step 2: Upload your prepared custom dictionary text file

To add files, you can either (1) drag and drop the file into the designated field or (2) click “Browse,” select your custom dictionary file, and then click “Open.”

If you need to create a new custom dictionary file, click the “Sample Custom Vocabulary File” button to download a template. For first-time users, refer to the detailed instructions by clicking the “AWS guide for building a custom vocabulary file” button.

Note: The system enforces strict formatting requirements for uploaded files. If your file fails to upload, verify that all formatting rules are followed. Common issues include using an incorrect file type (only .txt files are supported) or including spaces in multi-word entries.
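
The two common issues named above can be checked before uploading. The sketch below is a minimal example under stated assumptions: it assumes the file is tab-separated with a header row and the phrase in the first column, as in the AWS table format for custom vocabularies. The template from the “Sample Custom Vocabulary File” button is the authoritative reference for the layout, and the function names here are hypothetical.

```python
from pathlib import Path


def check_vocabulary_text(text: str) -> list[str]:
    """Flag phrases (first column) that contain spaces."""
    problems = []
    lines = text.splitlines()
    # Assumption: line 1 is a header row, so checking starts at line 2.
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            continue
        phrase = line.split("\t")[0]
        if " " in phrase:
            problems.append(
                f"line {number}: phrase {phrase!r} contains a space; "
                "join the words with hyphens instead"
            )
    return problems


def check_vocabulary_file(path: str) -> list[str]:
    """Check the file type, then the phrases inside the file."""
    file = Path(path)
    problems = []
    if file.suffix.lower() != ".txt":
        problems.append("only .txt files are supported")
    problems.extend(check_vocabulary_text(file.read_text(encoding="utf-8")))
    return problems
```

An empty result only means these two checks passed; the system may enforce other formatting rules that this sketch does not cover.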

Step 3: Click “Register” to confirm upload of your dictionary

After you click “Register,” a progress bar shows the status of your dictionary upload. This process typically takes a couple of minutes.

Step 4: Dictionary processed and ready for transcription

When your custom dictionary is processed and ready for transcription, you will see it registered with a green “READY” notification next to the file name.

Train a Custom Language Model

Step 1: Navigate to the ‘Language Models’ page

After logging in to SSD Scribe, click the ‘Language Models’ button to get started.

Step 2: Upload your prepared training files

Enter a name for your custom model and select the language for training. The default base model can remain as “Wide Band” for general audio or video transcription. If you are transcribing call recordings, choose “Narrow Band.”

To add training files for the custom language model, you can either (1) drag and drop the files into the designated field or (2) click “Browse,” select your training files, and then click “Open.”

It is recommended that the uploaded files contain at least 100,000 words to achieve optimal accuracy. Training typically takes 6–10 hours, depending on the size of your data. Supported languages include English, Spanish, German, Japanese, and Hindi.

Note: Although the interface lists PDF files as acceptable, they often cause issues during training and should be avoided.
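
To see whether your training files reach the recommended 100,000 words, a rough count like the one below may help. It is only a sketch: the function names are hypothetical, it assumes plain-text files encoded as UTF-8, and it counts whitespace-separated tokens, which is not meaningful for languages written without spaces between words, such as Japanese.

```python
from pathlib import Path

RECOMMENDED_MIN_WORDS = 100_000  # recommendation stated in this guide


def count_words(text: str) -> int:
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def total_words(paths: list[str]) -> int:
    """Add up the word counts of several plain-text files."""
    return sum(count_words(Path(p).read_text(encoding="utf-8")) for p in paths)


def meets_recommendation(word_count: int) -> bool:
    """True when the count reaches the recommended minimum."""
    return word_count >= RECOMMENDED_MIN_WORDS


# Usage (file names are examples only):
# words = total_words(["corpus_part1.txt", "corpus_part2.txt"])
# print(words, meets_recommendation(words))
```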

Step 3: Indicate intention for uploaded files

For each uploaded file, specify whether it will be used as ‘Training Data’ or ‘Tuning Data.’

Step 4: Create language model

Once you have uploaded all your files for training and tuning data, click on “Create Language Model.”

When your model is in training, it will appear on the ‘Language Models’ page with the status “In Progress.”

Step 5: Custom language model ready for transcription

When your model is ready, it will appear on the ‘Language Models’ page with the status “Ready.”

Data Deletion Policy

Input and output files in AWS are automatically deleted after 30 days. Please store your original and output files as specified in your IRB protocol.
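
To keep track of when files will disappear, the 30-day window can be computed as in the sketch below. It assumes the 30 days are counted from the upload date; this guide does not state whether the clock starts at upload or at job completion, so treat the result as an estimate and download early. The function names and dates are illustrative only.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention period stated in this guide


def deletion_date(upload_date: date) -> date:
    """Estimate the date on which files are deleted."""
    # Assumption: the retention period starts on the upload date.
    return upload_date + timedelta(days=RETENTION_DAYS)


def days_remaining(upload_date: date, today: date) -> int:
    """Days left before the estimated deletion date (never negative)."""
    return max((deletion_date(upload_date) - today).days, 0)


if __name__ == "__main__":
    uploaded = date(2025, 3, 1)  # example date
    print(deletion_date(uploaded))
    print(days_remaining(uploaded, date(2025, 3, 21)))
```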

Support

For training or troubleshooting support, please contact the SSCS Teaching and Technology (T&T) team at ssdtnt@uchicago.edu.