How to Set Up Label Studio for Data Annotation

By Admin · Mar 2, 2026 · Updated Apr 23, 2026

Label Studio is an open-source data labeling platform that supports text, image, audio, video, and time-series annotation. Running it on your Breeze gives your team a centralized, private environment for creating high-quality training datasets for machine learning models.

Prerequisites

  • A Breeze instance with at least 2 GB of RAM
  • Python 3.9 or later
  • A PostgreSQL database (recommended for production)

Installing Label Studio

python3 -m venv ~/labelstudio-env
source ~/labelstudio-env/bin/activate
pip install label-studio

Starting the Server

Enable local file serving, then launch Label Studio on all network interfaces, port 8080:

export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/data/labeling
label-studio start --host 0.0.0.0 --port 8080

On first launch, you will be prompted to create an admin account. Access the interface at http://your-breeze-ip:8080.
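
If you script the setup, it can help to wait until the server accepts connections before you open the interface or call the API. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of Label Studio, and it only checks that the TCP port is reachable, not that the application has finished starting.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_server(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll the port until it accepts a connection or the attempts run out."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, wait_for_server("127.0.0.1", 8080) returns True once the port from the start command above is accepting connections.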

Using PostgreSQL for Production

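Label Studio stores its data in SQLite by default, which is fine for a single user. For multi-user environments, configure PostgreSQL as the backend, replacing secure_password with a strong password of your own: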

sudo -u postgres psql -c "CREATE USER ls_user WITH PASSWORD 'secure_password';"
sudo -u postgres psql -c "CREATE DATABASE labelstudio OWNER ls_user;"
sudo -u postgres psql -c "GRANT ALL PRIVILEGES ON DATABASE labelstudio TO ls_user;"

export DJANGO_DB=default
export POSTGRE_NAME=labelstudio
export POSTGRE_USER=ls_user
export POSTGRE_PASSWORD=secure_password
export POSTGRE_HOST=localhost
export POSTGRE_PORT=5432
label-studio start
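
A missing or empty variable is easy to overlook when the exports live in a shell profile or a deployment script. As a rough sketch, the helper below reports which of the variables listed above are unset before you start the server. The helper name is hypothetical, and the list of names is copied from this article rather than read from Label Studio itself.

```python
import os

# Variable names copied from the export lines in this article.
REQUIRED_VARS = (
    "DJANGO_DB",
    "POSTGRE_NAME",
    "POSTGRE_USER",
    "POSTGRE_PASSWORD",
    "POSTGRE_HOST",
    "POSTGRE_PORT",
)


def missing_postgres_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_postgres_vars() with no arguments checks the current process environment; an empty list means every variable is set.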

Creating an Annotation Project

In the Label Studio interface, create a new project and select a labeling template. Common templates include:

  • Text Classification — assign labels to text documents
  • Named Entity Recognition — highlight and tag entities in text
  • Image Classification — categorize images into classes
  • Object Detection — draw bounding boxes around objects in images
  • Audio Transcription — transcribe and segment audio files

Custom Labeling Configuration

Define custom labeling interfaces using XML. For example, a sentiment analysis task:

<View>
  <Text name="text" value="$text"/>
  <Choices name="sentiment" toName="text" choice="single">
    <Choice value="Positive"/>
    <Choice value="Negative"/>
    <Choice value="Neutral"/>
  </Choices>
</View>
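
In the configuration above, the toName attribute of Choices refers to the name of the Text tag, so a typo in either one breaks the link between the control and the data it labels. The sketch below is one way to catch that before pasting a configuration into the interface. It is a simple consistency check written for this article, not Label Studio's own validator, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def dangling_to_names(config_xml: str) -> list:
    """Return toName values that match no name attribute in the config."""
    root = ET.fromstring(config_xml)
    # Collect every declared name, e.g. "text" and "sentiment".
    names = {el.get("name") for el in root.iter() if el.get("name")}
    dangling = []
    for el in root.iter():
        target = el.get("toName")
        if target is not None and target not in names:
            dangling.append(target)
    return dangling
```

An empty list means every toName points at a tag that exists in the same configuration.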

Importing Data

Import tasks from local files, cloud storage, or via the API:

curl -X POST http://localhost:8080/api/projects/1/import \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '[{"text": "This product is amazing!"}, {"text": "Terrible customer service."}]'
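
The same request can be prepared from Python when the texts come from a file or another program. This is a minimal sketch with the standard library that mirrors the curl call above; the function names are illustrative, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_tasks(texts):
    """Turn plain strings into task dicts keyed by "text", skipping blank entries."""
    return [{"text": t} for t in texts if t.strip()]


def build_import_request(base_url, project_id, token, tasks):
    """Build (but do not send) the POST request shown in the curl example."""
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/import"
    return urllib.request.Request(
        url,
        data=json.dumps(tasks).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() sends it to the server. The "text" key matches the $text value in the labeling configuration.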

Exporting Annotations

Export completed annotations in various formats for training:

curl -X GET "http://localhost:8080/api/projects/1/export?exportType=JSON" \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -o annotations.json

Supported export formats include JSON, CSV, COCO (for object detection), CoNLL (for NER), and YOLO.
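
Once annotations.json is downloaded, you will usually flatten it into (text, label) pairs for training. The sketch below assumes the JSON export layout in which each task has a data object and a list of annotations, each with a result list whose items carry from_name and value.choices. Check this against your own export file, since the layout may differ between versions. The function name is hypothetical.

```python
import json


def extract_labels(tasks, from_name="sentiment"):
    """Return (text, label) pairs for the given control name from exported tasks."""
    pairs = []
    for task in tasks:
        text = task.get("data", {}).get("text")
        for annotation in task.get("annotations", []):
            for item in annotation.get("result", []):
                # Keep only results produced by the chosen control tag.
                if item.get("from_name") != from_name:
                    continue
                for choice in item.get("value", {}).get("choices", []):
                    pairs.append((text, choice))
    return pairs


def load_export(path):
    """Read an exported JSON file and return its list of tasks."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, extract_labels(load_export("annotations.json")) yields one pair per selected choice, and tasks without annotations contribute nothing.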

Connecting an ML Backend

Label Studio can connect to an ML backend for pre-annotations, where a model suggests labels that annotators then verify or correct:

pip install label-studio-ml
label-studio-ml init my_ml_backend --script ml_model.py
label-studio-ml start my_ml_backend --port 9090

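Connect the ML backend in Label Studio’s project settings under Machine Learning, using the backend's URL (http://localhost:9090 with the port above). Annotators then review and correct the model's suggestions instead of labeling every task from scratch, which can shorten annotation time considerably.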
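
To make the idea of a pre-annotation concrete, the sketch below builds predictions for the sentiment configuration shown earlier. It is an illustration under stated assumptions: the keyword classifier is a hypothetical stand-in for a real model, the fixed score is a placeholder, and the wiring into the label-studio-ml base class is left out because it depends on the installed version. Only the shape of each prediction (a result list with from_name, to_name, type, and value.choices) is meant to carry over.

```python
# Hypothetical keyword lists standing in for a trained model.
POSITIVE_WORDS = {"amazing", "great", "love"}
NEGATIVE_WORDS = {"terrible", "awful", "hate"}


def keyword_sentiment(text):
    """Very rough sentiment guess based on a few keywords."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & POSITIVE_WORDS:
        return "Positive"
    if words & NEGATIVE_WORDS:
        return "Negative"
    return "Neutral"


def make_choice_prediction(label, score, from_name="sentiment", to_name="text"):
    """Wrap a label in the prediction structure for a Choices control."""
    return {
        "result": [
            {
                "from_name": from_name,  # name of the Choices tag
                "to_name": to_name,      # name of the Text tag it labels
                "type": "choices",
                "value": {"choices": [label]},
            }
        ],
        "score": score,
    }


def predict_tasks(tasks):
    """Return one prediction per task, reading the text from task["data"]["text"]."""
    return [
        make_choice_prediction(keyword_sentiment(task["data"]["text"]), score=0.5)
        for task in tasks
    ]
```

The from_name and to_name values must match the names used in the project's labeling configuration, otherwise the suggestions cannot be attached to the right control.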

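Running as a Systemd Service

To start Label Studio at boot and restart it on failure, create a systemd unit such as /etc/systemd/system/label-studio.service (the file name is an example), adjusting User and the paths to your setup. If you use PostgreSQL, add the DJANGO_DB and POSTGRE_* variables as additional Environment= lines. Then run sudo systemctl daemon-reload followed by sudo systemctl enable --now label-studio.
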
[Unit]
Description=Label Studio
After=network.target postgresql.service

[Service]
User=deploy
Environment=LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
Environment=LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/data/labeling
ExecStart=/home/deploy/labelstudio-env/bin/label-studio start --host 0.0.0.0 --port 8080
Restart=always

[Install]
WantedBy=multi-user.target
