feat: instrumentation based opentelemetry collection #3361
+1,934
−2
feat: Add OpenTelemetry instrumentation support
Summary
This PR introduces comprehensive OpenTelemetry (OTel) instrumentation support for Crawlee, enabling users to trace and monitor their crawlers with industry-standard observability tools. The implementation includes a new `@crawlee/otel` package that provides automatic instrumentation for core crawler operations, manual span wrapping utilities, and log forwarding capabilities.
What's New
New Package: `@crawlee/otel`
A dedicated OpenTelemetry instrumentation package that integrates seamlessly with Crawlee crawlers:
- Automatic Instrumentation: Automatically instruments core crawler methods including:
  - `BasicCrawler`: `run()`, `_runTaskFunction()`, `_requestFunctionErrorHandler()`, `_handleFailedRequestHandler()`, `_executeHooks()`
  - `BrowserCrawler`: `_handleNavigation()`, `_runRequestHandler()`
  - `HttpCrawler`: `_handleNavigation()`, `_runRequestHandler()`
- Manual Instrumentation: `wrapWithSpan()` utility function for wrapping custom handlers, hooks, and error handlers with OpenTelemetry spans
- Log Forwarding: Automatic forwarding of Crawlee logs to OpenTelemetry logs with proper severity level mapping
- Custom Instrumentation: Support for instrumenting custom class methods with configurable span names and attributes
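To illustrate the severity level mapping mentioned in the log forwarding item above, here is a minimal, dependency-free sketch. It assumes Crawlee log levels follow the numeric `LogLevel` values used by `@apify/log` (redeclared locally) and uses the base severity numbers from the OpenTelemetry logs data model. The name `toOtelSeverity` and the specific choices (for example, mapping `SOFT_FAIL` to WARN and `PERF` to DEBUG) are assumptions made for illustration and may differ from what `@crawlee/otel` actually does.

```typescript
// Numeric levels assumed to mirror the LogLevel enum in @apify/log
// (redeclared here so the sketch has no dependencies).
const CrawleeLogLevel = {
    OFF: 0,
    ERROR: 1,
    SOFT_FAIL: 2,
    WARNING: 3,
    INFO: 4,
    DEBUG: 5,
    PERF: 6,
} as const;

// Base severity numbers from the OpenTelemetry logs data model.
const OtelSeverity = {
    TRACE: 1,
    DEBUG: 5,
    INFO: 9,
    WARN: 13,
    ERROR: 17,
    FATAL: 21,
} as const;

interface MappedSeverity {
    severityNumber: number;
    severityText: string;
}

// Hypothetical mapping; returns undefined for levels that should not be forwarded.
function toOtelSeverity(level: number): MappedSeverity | undefined {
    switch (level) {
        case CrawleeLogLevel.ERROR:
            return { severityNumber: OtelSeverity.ERROR, severityText: 'ERROR' };
        case CrawleeLogLevel.SOFT_FAIL:
        case CrawleeLogLevel.WARNING:
            return { severityNumber: OtelSeverity.WARN, severityText: 'WARN' };
        case CrawleeLogLevel.INFO:
            return { severityNumber: OtelSeverity.INFO, severityText: 'INFO' };
        case CrawleeLogLevel.DEBUG:
        case CrawleeLogLevel.PERF:
            return { severityNumber: OtelSeverity.DEBUG, severityText: 'DEBUG' };
        default:
            return undefined; // OFF or an unknown level
    }
}
```

Returning `undefined` for `OFF` lets a forwarder skip the record entirely instead of emitting it with an arbitrary severity.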
Key Features
Implementation Details
Core Components
- `CrawleeInstrumentation`: Main instrumentation class extending OpenTelemetry's `InstrumentationBase`
- `wrapWithSpan()`: Utility function for wrapping functions with spans, supporting dynamic span names and attributes
- Integration with `@apify/log` to forward logs to OpenTelemetry
Instrumented Methods
The following methods are automatically instrumented when `requestHandlingInstrumentation` is enabled:
| Class | Method | Span name |
| --- | --- | --- |
| `BasicCrawler` | `run` | `crawlee.crawler.run` |
| `BasicCrawler` | `_runTaskFunction` | `crawlee.crawler.runTaskFunction` |
| `BasicCrawler` | `_requestFunctionErrorHandler` | `crawlee.crawler.requestFunctionErrorHandler` |
| `BasicCrawler` | `_handleFailedRequestHandler` | `crawlee.crawler.handleFailedRequestHandler` |
| `BasicCrawler` | `_executeHooks` | `crawlee.crawler.executeHooks` |
| `BrowserCrawler` | `_handleNavigation` | `crawlee.browser.handleNavigation` |
| `BrowserCrawler` | `_runRequestHandler` | `crawlee.browser.runRequestHandler` |
| `HttpCrawler` | `_handleNavigation` | `crawlee.http.handleNavigation` |
| `HttpCrawler` | `_runRequestHandler` | `crawlee.http.runRequestHandler` |
Request handler spans automatically include attributes:
- `crawlee.request.id`
- `crawlee.request.url`
- `crawlee.request.method`
- `crawlee.request.retry_count`
Usage Examples
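As a small illustration of the request attributes listed earlier, the sketch below builds the attribute object from a request-like value. The `RequestLike` interface and the `requestSpanAttributes` helper are hypothetical names introduced here; only the four attribute keys come from this PR, and the package may assemble them differently.

```typescript
// Minimal request shape for this sketch; Crawlee's Request class carries more fields.
interface RequestLike {
    id?: string;
    url: string;
    method: string;
    retryCount: number;
}

// Hypothetical helper: maps a request onto the attribute keys named in this PR.
function requestSpanAttributes(request: RequestLike): Record<string, string | number> {
    const attributes: Record<string, string | number> = {
        'crawlee.request.url': request.url,
        'crawlee.request.method': request.method,
        'crawlee.request.retry_count': request.retryCount,
    };
    // Skip the id when it is not set, rather than recording an undefined value.
    if (request.id !== undefined) {
        attributes['crawlee.request.id'] = request.id;
    }
    return attributes;
}

const exampleAttributes = requestSpanAttributes({
    id: 'abc123',
    url: 'https://example.com/products',
    method: 'GET',
    retryCount: 1,
});
```

Using flat string keys with primitive values keeps the result compatible with how span attributes are generally represented in OpenTelemetry.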
Basic Setup
Manual Span Wrapping
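The following is a dependency-free sketch of the general pattern behind a span-wrapping helper: start a span, run the wrapped function, record success or failure, and end the span for both sync and async functions. It is not the `@crawlee/otel` API. `RecordingTracer`, `wrapWithSpanSketch`, and the `spanName` / `spanAttributes` option names are hypothetical stand-ins, and the in-memory tracer only exists so the example runs without the OpenTelemetry SDK.

```typescript
type Attributes = Record<string, string | number | boolean>;

// What the in-memory tracer remembers about each span.
interface RecordedSpan {
    name: string;
    attributes: Attributes;
    status: 'unset' | 'ok' | 'error';
    exception?: unknown;
    ended: boolean;
}

// Hypothetical stand-in for a real tracer; it only records spans in an array.
class RecordingTracer {
    readonly spans: RecordedSpan[] = [];

    startSpan(name: string, attributes: Attributes): RecordedSpan {
        const span: RecordedSpan = { name, attributes: { ...attributes }, status: 'unset', ended: false };
        this.spans.push(span);
        return span;
    }
}

// Span name and attributes may be static or computed from the call arguments.
interface WrapOptions<A extends unknown[]> {
    spanName: string | ((...args: A) => string);
    spanAttributes?: Attributes | ((...args: A) => Attributes);
}

function wrapWithSpanSketch<A extends unknown[], R>(
    tracer: RecordingTracer,
    fn: (...args: A) => R,
    options: WrapOptions<A>,
): (...args: A) => R {
    return (...args: A): R => {
        const { spanName, spanAttributes } = options;
        const name = typeof spanName === 'function' ? spanName(...args) : spanName;
        const attributes = typeof spanAttributes === 'function' ? spanAttributes(...args) : spanAttributes ?? {};
        const span = tracer.startSpan(name, attributes);

        const succeed = <T>(value: T): T => {
            span.status = 'ok';
            span.ended = true;
            return value;
        };
        const fail = (error: unknown): never => {
            span.status = 'error';
            span.exception = error;
            span.ended = true;
            throw error; // rethrow so callers still see the original error
        };

        let result: R;
        try {
            result = fn(...args);
        } catch (error) {
            return fail(error);
        }
        // Async functions: keep the span open until the promise settles.
        if (result instanceof Promise) {
            return result.then(succeed, fail) as unknown as R;
        }
        return succeed(result);
    };
}

// Usage: wrap a handler-like function and give the span a dynamic name.
const demoTracer = new RecordingTracer();
const measureUrl = wrapWithSpanSketch(demoTracer, (url: string) => url.length, {
    spanName: (url) => `measure ${url}`,
    spanAttributes: { 'example.kind': 'sync' },
});
measureUrl('https://example.com');
```

A real helper would typically also make the span active in the current context so that nested spans become its children; that part is omitted here because it depends on the OpenTelemetry context API.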
Custom Instrumentation
Documentation
`docs/guides/trace-and-monitor-crawlers.mdx` covering:
Testing
Comprehensive test coverage including:
Unit Tests:
- `wrapWithSpan` functionality (sync/async, errors, context)
Integration Tests:
Dependencies
Peer Dependencies
- `@opentelemetry/api`: `^1.3.0`
- `@opentelemetry/api-logs`: `^0.210.0`
Dependencies
- `@opentelemetry/instrumentation`: `^0.210.0`
Related Issues
Closes #2955
Note: This implementation follows OpenTelemetry instrumentation best practices and integrates with the wider OpenTelemetry ecosystem. Users can export traces to any OpenTelemetry-compatible backend (Jaeger, Zipkin, SigNoz, etc.).