MyObservability

Get Data Into Splunk - OTel

To get data in, you first need to understand how data flows through the OpenTelemetry Collector.

Data Flow

Pipelines are the central concept of the OpenTelemetry Collector. There are three types:

  1. Metric Pipelines
  2. Trace Pipelines
  3. Log Pipelines

Components

A pipeline is made of components. There are three types of components: receivers, processors, and exporters.

Additional

Extensions are additional components that extend the capabilities of the Collector without taking part in a pipeline. See the Splunk Collector Components list for the full set of available components.

Example:

receivers:                       # Receivers gather data from various sources.
  hostmetrics:                   # Collects host-level metrics, such as CPU, memory, and disk usage, at a specified interval.
    collection_interval: 60s     # Interval between metric collections.
    scrapers:                    # Defines the specific metric groups to collect.
      cpu:
      load:
      memory:
      disk:
      filesystem:
      network:
  otlp:                          # The OpenTelemetry Protocol (OTLP) receiver gathers metrics, traces, and logs sent in OTLP format.
    protocols:                   # Supported data transmission protocols.
      grpc:
      http:
processors:                      # Processors modify or enhance data after it is received and before it is sent to an exporter.
  batch:
    timeout: 5s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s           # Required: how often memory usage is measured.
    limit_mib: 500
    spike_limit_mib: 200
  resource:                      # Adds or modifies resource attributes (metadata) associated with all collected data.
    attributes:
      - key: environment
        action: insert           # Supported actions: insert, update, upsert, delete, hash, extract.
        value: production
      - key: service.name
        action: insert           # Every attribute entry requires an action.
        value: my-application
exporters:                        # Exporters send processed data to a backend or observability platform, such as Splunk.
  splunk_hec:
    token: "<your-splunk-hec-token>"
    endpoint: "https://ingest.<your-splunk-cloud-instance>.splunkcloud.com:8088"
    source: "otel-collector"
    sourcetype: "_json"
    index: "main"
service:                          # The service block enables the previously configured components (extensions, receivers, processors, exporters) within pipelines.
  pipelines:                      # Each pipeline is a distinct data processing path within the Collector.
    metrics:
      receivers: [hostmetrics, otlp]
      processors: [memory_limiter, batch, resource]   # memory_limiter is recommended to run first in the pipeline.
      exporters: [splunk_hec]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [splunk_hec]

Steps to Install & Configure the OpenTelemetry Collector

  1. Install the OpenTelemetry Collector to send server and cluster data (For Infra only)
    • Deploy Splunk Distribution of the OTel Collector
  2. (Optional) Configure third-party server applications to send metrics, logs, and traces
    • Configure the Collector’s native receivers or any of these third-party applications, such as Apache, Cassandra, Hadoop, Kafka, and NGINX, to monitor your systems
  3. Instrument back-end services and applications to send traces, logs, and metrics (For Infra & APM)
    • Edit the Configuration File – insert processor into pipeline
  4. Restart the OTel Collector
  5. Verify that the new dimension is being sent

Step-1: Install the OpenTelemetry Collector to send server and cluster data

Install the Splunk Distribution of OpenTelemetry Collector using the method that best suits your environment:

For Linux/Windows

  1. Data Management -> Add Integration -> Deploy the Splunk OpenTelemetry Collector

  2. Click “Next”

  3. Select Platform

  4. Install Configuration - automatically populated depending on the platform.

Linux:

Kubernetes:

Provide all necessary details.

  5. Follow the installation instructions and select the technology (installer script/Ansible/Puppet/Chef/Salt) that suits your requirement.

  6. Check that data is coming into Splunk Observability Cloud.

Step-2: (Optional) Configure third-party server applications to send metrics, logs, and traces

Configure the Collector’s native receivers or any of these third-party applications, such as Apache, Cassandra, Hadoop, Kafka, and NGINX, to monitor your systems
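For example, the Collector ships native receivers for several of these applications. A minimal sketch of an NGINX receiver, assuming NGINX exposes its stub_status page at the endpoint shown (the URL is illustrative; adjust it to your server):

receivers:
  nginx:
    endpoint: "http://localhost:80/status"   # Hypothetical stub_status URL.
    collection_interval: 30s                 # How often to scrape NGINX metrics.

Like hostmetrics above, the receiver must then be added to a metrics pipeline in the service block before it takes effect.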

Step-3: Customizing the Agent Configuration File

Steps to add a component to a pipeline in the OpenTelemetry Collector

  1. First, create a block that defines the component and sets any configurable options. Defining a component does not enable it; it only makes the component available for use in a pipeline.
  2. To enable a defined component, place it in a pipeline. You can reuse the same defined component in multiple pipelines, and you can define the same component more than once with different configurations, as the sketch below illustrates.
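A minimal sketch of that last point, assuming two batch processors with hypothetical labels, each enabled in a different pipeline:

processors:
  batch/quick:                   # Hypothetical label: small, frequent batches.
    timeout: 1s
  batch/bulk:                    # Hypothetical label: larger, less frequent batches.
    timeout: 10s
    send_batch_size: 4096
service:
  pipelines:
    traces:
      processors: [batch/quick]
    metrics:
      processors: [batch/bulk]

(Receivers and exporters are omitted here for brevity; a real pipeline needs at least one of each.)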

Example 1:

  1. Add a new processor of type resource with the label add_environment.
    • The processor must be under the processors block
    • Give the new processor a label: resource/add_environment
  2. Configure the processor to add a new dimension as follows:
    • action: insert
    • key: deployment.environment
    • value: <use your name>

Note that this will be under attributes for the processor.
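A minimal sketch of the resulting processor definition (the value is a placeholder, per the instructions above):

processors:
  resource/add_environment:
    attributes:
      - action: insert
        key: deployment.environment
        value: <your-name>       # Placeholder: replace with your name.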

Task 2: Edit the Configuration File – insert processor into pipeline

  1. Add the processor to the list of processors in the default (unlabeled) metrics pipeline (service > pipelines > metrics > processors), as the sketch below shows.

  2. Save the file.
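A sketch of the updated pipeline, assuming the receivers, other processors, and exporters from the earlier example are kept in place:

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, otlp]
      processors: [memory_limiter, batch, resource/add_environment]   # New processor appended.
      exporters: [splunk_hec]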

Example 2:

  1. To mask sensitive data, define a resource processor that hashes the attribute:

processors:
  resource/mask_username:        # The type before "/" must be "resource"; the label after it is arbitrary.
    attributes:
      - action: hash             # Replaces the attribute value with its hash.
        key: username

  2. Add the processor to a pipeline:

service:
  pipelines:                     # Note: the key is "pipelines", not "pipeline".
    metrics:
      receivers: [hostmetrics, otlp]
      processors: [resource/mask_username]   # Add processor to pipeline.
      exporters: [splunk_hec]

  3. Restart the Collector.

Step-4: Restarting Splunk OTel Collector

sudo systemctl restart splunk-otel-collector

Step-5: Verify that the new dimension is being sent
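One way to confirm the change took effect, assuming a systemd-based Linux install:

sudo systemctl status splunk-otel-collector        # Service should be active (running).
sudo journalctl -u splunk-otel-collector -n 50     # Scan recent logs for configuration errors.

Then search for the new deployment.environment dimension in Splunk Observability Cloud (for example, in the Metric Finder).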

Troubleshooting

Troubleshooting Common Issues

More options:

Uninstalling the agent

sudo sh /tmp/splunk-otel-collector.sh --uninstall

Customizing the Agent Configuration File

Pipelines Direct the Data Flow

  1. Receiver: Gets data into the Collector from multiple sources.
  2. Processor: Performs operations on data before it is exported, for example filtering. Once data is in the pipeline, processor components filter, manipulate, or extend it.
    • Memory Limiter: Prevents out-of-memory situations on the Collector. It restricts the data processed so that the Collector process stays within its memory limits.
    • Batch Processor: Accepts trace spans, metrics, or logs and places them into batches. Batching compresses the data better and reduces the number of outgoing connections required to transmit it.
    • Resource Detection: Enhances telemetry data with labels that describe the underlying host machine (see the sketch below).
  3. Exporter: Sends data to one or more backends or destinations (e.g., Splunk Observability Cloud).
  4. Extensions: Extend the capabilities of the Collector.
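A minimal sketch of the Resource Detection processor described above, assuming the built-in system detector is enough to label the host:

processors:
  resourcedetection:
    detectors: [system]          # Reads host.name, os.type, etc. from the local system.
    override: true               # Overwrite existing resource attributes with detected values.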

Configuration files

/etc/otel/collector/agent_config.yaml

Pipelines are defined in the service block of the configuration file. The code above defines a metrics pipeline.

Extension block

Extend the capabilities of the Collector
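A minimal sketch of an extensions block, assuming the stock health_check and zpages extensions; note that extensions are enabled in the service block directly, not inside a pipeline:

extensions:
  health_check:
    endpoint: 0.0.0.0:13133      # Liveness/readiness endpoint.
  zpages:                        # In-process diagnostic pages.
service:
  extensions: [health_check, zpages]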

Receivers block

Processors block

Exporters block

Service block

The service block contains the different pipelines.

/etc/otel/collector/splunk-otel-collector.conf

It contains the environment variables for the installation.
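A sketch of typical contents, with placeholder values (variable names follow the Splunk distribution's conventions; the values here are illustrative):

SPLUNK_ACCESS_TOKEN=<your-access-token>      # Placeholder: org access token.
SPLUNK_REALM=<your-realm>                    # Placeholder: e.g., us0, us1, eu0.
SPLUNK_CONFIG=/etc/otel/collector/agent_config.yaml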

AlwaysOn Profiling

For details about profiling and AlwaysOn Profiling, refer to the Splunk documentation.