You need to understand data flow in the OpenTelemetry Collector.
Pipelines are a central concept of the OpenTelemetry Collector. A pipeline is made of components, and there are three types of components: receivers, processors, and exporters.
The Splunk Distribution of the OpenTelemetry Collector also bundles additional components beyond the upstream set; refer to the distribution's documentation for the full list.
Example:
receivers:                       # gather data from various sources
  hostmetrics:                   # collects host-level metrics, such as CPU, memory, and disk usage, at a specified interval
    collection_interval: 60s     # interval between metric collections
    scrapers:                    # defines the specific metric groups to collect
      cpu:
      load:
      memory:
      disk:
      filesystem:
      network:
  otlp:                          # the OpenTelemetry Protocol receiver gathers metrics, traces, and logs sent in OTLP format
    protocols:                   # data transmission protocols
      grpc:
      http:
processors:                      # processors modify or enhance data after it is received and before it is sent to an exporter
  batch:
    timeout: 5s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 2s           # required: how often memory usage is measured
    limit_mib: 500
    spike_limit_mib: 200
  resource:                      # adds or modifies resource attributes (metadata) associated with all collected data
    attributes:
      - key: environment
        action: insert           # supported actions: insert, update, upsert, delete, hash, extract
        value: production
      - key: service.name
        action: insert
        value: my-application
exporters:                       # exporters send processed data to a backend or observability platform, such as Splunk
  splunk_hec:
    token: "<your-splunk-hec-token>"
    endpoint: "https://ingest.<your-splunk-cloud-instance>.splunkcloud.com:8088/services/collector"
    source: "otel-collector"
    sourcetype: "_json"
    index: "main"
service:                         # the service block is where the configured components (extensions, receivers, processors, exporters) are enabled within pipelines
  pipelines:                     # each pipeline is a distinct data processing path within the Collector
    metrics:
      receivers: [hostmetrics, otlp]
      processors: [memory_limiter, batch, resource]   # memory_limiter should be the first processor in the chain
      exporters: [splunk_hec]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [splunk_hec]
Install the Splunk Distribution of OpenTelemetry Collector using the method that best suits your environment:
Linux: installer script or package manager
Windows: installer script (PowerShell) or MSI
Kubernetes: Helm chart
Provide the required details (realm, access token, and so on) during installation; see the sketch below.
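As a sketch of the standard install flow (the realm, access token, and cluster name are placeholders you supply):

  # Linux: download and run the installer script
  curl -sSL https://dl.signalfx.com/splunk-otel-collector.sh > /tmp/splunk-otel-collector.sh
  sudo sh /tmp/splunk-otel-collector.sh --realm <realm> -- <access_token>

  # Kubernetes: install the Helm chart
  helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart
  helm install splunk-otel-collector \
    --set="splunkObservability.realm=<realm>,splunkObservability.accessToken=<access_token>,clusterName=<cluster_name>" \
    splunk-otel-collector-chart/splunk-otel-collector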
Configure the Collector's native receivers to monitor your systems and third-party applications such as Apache, Cassandra, Hadoop, Kafka, and NGINX.
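For example, the Collector's nginx receiver scrapes NGINX's stub_status page (a minimal sketch; it assumes stub_status is exposed at http://localhost:80/status):

  receivers:
    nginx:
      endpoint: "http://localhost:80/status"   # URL of the NGINX stub_status page
      collection_interval: 30s                 # how often to scrape

Like any receiver, it only takes effect once it is added to a metrics pipeline in the service block, as shown in the steps below.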
Steps to add a component to a pipeline in the OpenTelemetry Collector:
Task 1: Edit the configuration file – define the new processor. Note that the hash action goes under the processor's attributes list.
Task 2: Edit the configuration file – insert the processor into the pipeline.
Example:
processors:
  resource/mask_username:        # hash the value of the username attribute
    attributes:
      - action: hash
        key: username
service:
  pipelines:
    metrics:
      receivers: [hostmetrics, otlp]
      processors:
        - resource/mask_username # add the processor to the pipeline
      exporters: [splunk_hec]
Restart the Collector so the new configuration takes effect:
sudo systemctl restart splunk-otel-collector
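To confirm the service came back up cleanly, check its status (a standard systemd command, nothing Splunk-specific):

  sudo systemctl status splunk-otel-collector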
Troubleshooting Common Issues
Network connectivity - if the agent process is running, check that you can connect to Splunk Observability Cloud; you must be able to send data to the ingest API:
curl -o - -I https://ingest.<realm>.signalfx.com/v2/datapoint
The curl command checks that the endpoint is reachable, but it does not verify that your account is properly configured.
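To also exercise the access token, you can send a single test datapoint to the ingest API (a sketch of a SignalFx ingest API call; the metric name and value are arbitrary placeholders):

  curl -X POST "https://ingest.<realm>.signalfx.com/v2/datapoint" \
    -H "Content-Type: application/json" \
    -H "X-SF-Token: <access_token>" \
    -d '{"gauge": [{"metric": "test.connectivity", "value": 1}]}'

A 200 response indicates the realm and token are valid.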
Check the Collector logs (newest entries first):
journalctl -t otelcol -r
More options: see the Splunk OpenTelemetry Collector troubleshooting documentation.
Uninstalling the agent
sudo sh /tmp/splunk-otel-collector.sh --uninstall
Pipelines Direct the Data Flow
/etc/otel/collector/agent_config.yml
Pipelines are defined in the service block of this configuration file; the example earlier defines a metrics pipeline and a traces pipeline.
Extensions block - extends the capabilities of the Collector; see the sketch after this list.
Receivers block - gathers data from various sources.
Processors block - modifies or enhances data in flight.
Exporters block - sends processed data to backends such as Splunk.
Service block - enables the configured components and contains the different pipelines.
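A minimal sketch of an extensions block, using two common extensions (health_check and zpages; the ports shown are their defaults):

  extensions:
    health_check:                # exposes a liveness endpoint for the Collector
      endpoint: 0.0.0.0:13133
    zpages:                      # in-process diagnostic pages
      endpoint: 0.0.0.0:55679
  service:
    extensions: [health_check, zpages]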
/etc/otel/collector/splunk-otel-collector.conf
This file contains the environment variables for the installation.
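Typical contents look like this (a sketch; the variable names are the ones the installer writes, and the values are placeholders):

  SPLUNK_CONFIG=/etc/otel/collector/agent_config.yml
  SPLUNK_ACCESS_TOKEN=<access_token>
  SPLUNK_REALM=<realm>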
For details on Profiling / AlwaysOn Profiling, refer to the documentation.