
Get Data Into Splunk - OTel

Before sending data to Splunk, you need to understand how data flows through the OpenTelemetry Collector.

Data Flow

Pipelines are the central concept of the OpenTelemetry Collector. There are three types:

  1. Metric Pipelines
  2. Trace Pipelines
  3. Log Pipelines


A pipeline is made of components. There are three types of components: receivers, processors, and exporters.


Splunk Collector Components - list


receivers:                       # Receivers gather data from various sources.
  hostmetrics:                   # Collects host-level metrics, such as CPU, memory, and disk usage, at a specified interval.
    collection_interval: 60s     # Interval between metric collections.
    scrapers:                    # Defines the specific metrics to collect (here, the CPU, memory, and disk scrapers named above).
      cpu:
      memory:
      disk:
  otlp:                          # The OpenTelemetry Protocol (OTLP) receiver gathers metrics, traces, and logs sent in OTLP format.
    protocols:                   # Data transmission protocols (OTLP supports gRPC and HTTP).
      grpc:
      http:
processors:                      # Processors modify or enhance data after it is received and before it is sent to an exporter.
  batch:                         # Groups data into batches before export.
    timeout: 5s
    send_batch_size: 1024
  memory_limiter:                # Keeps the Collector within its memory limits.
    check_interval: 1s           # Required by memory_limiter; 1s is an assumed value.
    limit_mib: 500
    spike_limit_mib: 200
  resource:                      # Adds or modifies resource attributes (metadata) associated with all collected data.
    attributes:
      - key: environment
        action: insert           # One of: insert, update, upsert, delete, hash, extract.
        value: production
      - key:                     # Fill in the attribute key and an action for this entry.
        value: my-application
exporters:                       # Exporters send processed data to a backend or observability platform, such as Splunk.
  splunk_hec:
    token: "<your-splunk-hec-token>"
    endpoint: "https://ingest.<your-splunk-cloud-instance>"
    source: "otel-collector"
    sourcetype: "_json"
    index: "main"
service:                         # The service block enables the components configured above (extensions, receivers, processors, exporters) by placing them in pipelines.
  pipelines:                     # Each pipeline is a distinct data processing path within the Collector.
    metrics:
      receivers: [hostmetrics, otlp]
      processors: [memory_limiter, batch, resource]   # memory_limiter runs first, as the Collector documentation recommends.
      exporters: [splunk_hec]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [splunk_hec]

Steps to Install & Configure the OpenTelemetry Collector

  1. Install the OpenTelemetry Collector to send server and cluster data (For Infra only)
    • Deploy the Splunk Distribution of the OpenTelemetry Collector
  2. (Optional) Configure third-party server applications to send metrics, logs, and traces
    • Configure the Collector’s native receivers, or receivers for third-party applications such as Apache, Cassandra, Hadoop, Kafka, and NGINX, to monitor your systems
  3. Instrument back-end services and applications to send traces, logs, and metrics (For Infra & APM)
    • Edit the Configuration File – insert processor into pipeline
  4. Restart the OTel Collector
  5. Verify that the new dimension is being sent

Step-1: Install the OpenTelemetry Collector to send server and cluster data

Install the Splunk Distribution of OpenTelemetry Collector using the method that best suits your environment:

For Linux/Windows

  1. Go to Data Management -> Add Integration -> Deploy the Splunk OpenTelemetry Collector.
  2. Click “Next”.
  3. Select the platform.
  4. Install Configuration: the fields populate automatically depending on the platform. Provide all necessary details.
  5. Follow the installation instructions and select the deployment method (script, Ansible, Puppet, Chef, or Salt) that fits your requirements.
  6. Check that data is coming into Splunk Observability Cloud.

Step-2: (Optional) Configure third-party server applications to send metrics, logs, and traces

Configure the Collector’s native receivers, or receivers for third-party applications such as Apache, Cassandra, Hadoop, Kafka, and NGINX, to monitor your systems.
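
As an illustration of this step, the fragment below sketches a receiver for one of the applications listed, Apache. It assumes your Collector build includes the apache receiver and that Apache exposes mod_status at the URL shown; the host, port, and interval are placeholder values, not settings taken from this guide.

```yaml
receivers:
  apache:
    # Assumed mod_status URL; adjust the host and port to your server.
    endpoint: "http://localhost:8080/server-status?auto"
    # Assumed scrape interval.
    collection_interval: 10s
```

Like any other component, the receiver only starts collecting once it is also listed under receivers in a metrics pipeline in the service block.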

Step-3: Customizing the Agent Configuration File

Steps to add a component to a pipeline in the OpenTelemetry Collector

  1. First, you must create a block to define the component, and set any configurable options. Defining a component does not enable it – but it makes it available for you to use in a pipeline.
  2. To enable a defined component, place it in a pipeline. You can re-use the same defined component in multiple pipelines. You can also define the same component more than once, with different configurations.
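
The two steps above can be sketched as follows. This is a minimal fragment, assuming an otlp receiver and a splunk_hec exporter are defined elsewhere in the file; the label batch/large and its values are hypothetical.

```yaml
processors:
  batch:                  # Defined with default settings.
  batch/large:            # Same component type, defined again with a different configuration.
    send_batch_size: 8192
    timeout: 10s

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]   # batch is enabled here ...
      exporters: [splunk_hec]
    logs:
      receivers: [otlp]
      processors: [batch]   # ... and reused here.
      exporters: [splunk_hec]
```

In this sketch batch/large is defined but appears in no pipeline, so it stays inactive until it is added to one.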

Example 1:

  1. Add a new processor of type resource with the label add_environment.
    • The processor must be under the processors block
    • Give the new processor a label: resource/add_environment
  2. Configure the processor to add a new dimension as follows:
    • action: insert
    • key: deployment.environment
    • value: <use your name>

Note that these settings go under the processor’s attributes key.
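
Put together, the definition from Example 1 would look roughly like this. The value is a placeholder to replace with your own name.

```yaml
processors:
  resource/add_environment:
    attributes:
      - action: insert
        key: deployment.environment
        value: "<your-name>"
```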

Task 2: Edit the Configuration File – insert processor into pipeline

  1. Add the processor to the list of processors in the default (unlabeled) metrics pipeline (service > pipelines > metrics > processors).
  2. Save the file.
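
A sketch of the edited pipeline is shown below. The other component names are assumptions for illustration; keep whatever receivers, processors, and exporters your file already lists and append only the new processor.

```yaml
service:
  pipelines:
    metrics:
      receivers: [hostmetrics, otlp]
      # resource/add_environment is appended to the existing processor list.
      processors: [memory_limiter, batch, resource/add_environment]
      exporters: [splunk_hec]
```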

Example 2:

  1. To mask sensitive data, define a processor that hashes the value:
      resource/mask_username:
        attributes:
          - action: hash
            key: username
  2. Add the processor to a pipeline:
         - resource/mask_username  # Add processor to pipeline.
  3. Restart the Collector.
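
Example 2 can be sketched end to end as below. The component type is resource (singular), so the label is written resource/mask_username. The choice of the metrics pipeline and the other processor names are assumptions, and the receivers and exporters lines of the pipeline are omitted for brevity.

```yaml
processors:
  resource/mask_username:
    attributes:
      - action: hash      # Replaces the value with its hash.
        key: username

service:
  pipelines:
    metrics:
      processors:
        - memory_limiter
        - batch
        - resource/mask_username  # Add processor to pipeline.
```

The resource processor acts on resource attributes only. If username is carried as a span or log attribute instead, the attributes processor accepts the same action list.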

Step-4: Restarting Splunk OTel Collector

sudo systemctl restart splunk-otel-collector

Step-5: Verify that the new dimension is being sent


Troubleshooting Common Issues

More options:

Uninstalling the agent

sudo sh /tmp/ --uninstall

Customizing the Agent Configuration File

Pipelines Direct the Data Flow

  1. Receiver: Gets data into the Collector from one or more sources.
  2. Processor: Performs operations on data before it is exported. Once data is in the pipeline, processors filter, manipulate, or extend it.
    • Memory Limiter: Prevents out-of-memory situations on the Collector by restricting the data it processes so that the Collector process stays within its memory limits.
    • Batch Processor: Accepts trace spans, metrics, or logs and places them into batches. Batching improves compression and reduces the number of outgoing connections needed to transmit the data.
    • Resource Detection: Enhances telemetry data with labels that describe the underlying host machine.
  3. Exporter: Sends data to one or more backends or destinations (for example, Splunk Observability Cloud).
  4. Extensions: Extend the capabilities of the Collector. They are enabled in the service block rather than in a pipeline.
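
As an illustration of the Resource Detection processor described above, a minimal definition might look like this. The detector list is an assumption; choose the detectors that match where the Collector runs.

```yaml
processors:
  resourcedetection:
    # system reads host information; env reads the OTEL_RESOURCE_ATTRIBUTES variable.
    detectors: [system, env]
    # Let detected values overwrite attributes already present on the data.
    override: true
```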

Configuration files


Pipelines are defined in the service block of the configuration file. The code above defines a metrics pipeline.

Extension block

Extend the capabilities of the Collector
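
A minimal sketch of an extensions block is shown below, using two commonly enabled extensions. The endpoints are the usual defaults and are assumptions here, not values taken from this guide.

```yaml
extensions:
  health_check:
    endpoint: 0.0.0.0:13133   # Exposes an HTTP health status endpoint.
  zpages:
    endpoint: 0.0.0.0:55679   # Exposes in-process diagnostic pages.

service:
  # Extensions are enabled here, not inside a pipeline.
  extensions: [health_check, zpages]
```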

Receivers block

Processors block

Exporter block

Service block

The service block contains the different pipelines.


It contains environment variables of our installations.

AlwaysOn Profiling

For details about profiling and AlwaysOn Profiling, refer to the documentation.