How to Efficiently Query High-Volume PI Archive Data Using PI OLEDB and .NET
Learn how to optimize .NET applications to query thousands of PI tags efficiently using PI OLEDB, including batching, SQL tricks, threading, and alternatives.
Roshan Soni
Performance Optimization for High-Volume PI Data Queries in .NET Applications
Introduction
Developers working with the OSIsoft PI System often need to retrieve large volumes of data from the PI Data Archive, such as querying thousands of tags at periodic intervals. Using the PI OLEDB provider can make this task straightforward, but achieving optimal performance and scalability requires careful design choices. In this post, we'll explore real-world strategies, benchmarks, and best practices for high-frequency, high-volume PI data retrieval using PI OLEDB—complemented with .NET techniques and alternatives.
The Scenario
Suppose you have a .NET service that needs to retrieve recent data for ~3000 PI tags every 60 seconds. Each tag must be queried with a timestamp range (often starting from the last received value). A simple, naïve approach would be to execute a separate SQL-like query for each tag, for example:
SELECT tag, time, value, status FROM piarchive..picomp2
WHERE tag = 'TagA'
AND time > ? AND time <= ?
ORDER BY time ASC
However, as system and data volume scale, so does query time—potentially straining the PI system and the client application.
Optimization Strategies and Findings
1. Query Multiple Tags with WHERE tag IN (...)
Grouping tags into a single query, for example:
SELECT tag, time, value, status FROM piarchive..picomp2
WHERE tag IN ('TagA', 'TagB', 'TagC')
AND time > ? AND time <= ?
ORDER BY tag, time ASC
Results: Massive speedup compared to individual queries per tag. In one documented test, querying 850 tags dropped from 70-80 seconds (per-tag queries) to 15-20 seconds (single query with IN-list).
Caveats:
- All tags in the query must share the same time window. If you need different time ranges per tag, this approach doesn't apply.
- There may be practical limits on the length of the IN-list or SQL query size (database or network constraints).
2. Using UNION to Mix Time Ranges
When needing per-tag time ranges, some developers try UNION to chain together many single-tag queries in one statement:
SELECT ... WHERE tag = 'TagA' AND ...
UNION
SELECT ... WHERE tag = 'TagB' AND ...
-- etc.
Results:
- Only minor improvements for fewer than 30 tags/unions.
- With many tags, performance gains fade and queries get unwieldy.
3. Leveraging OLEDB Parameters
Parameterized queries not only avoid SQL injection but, crucially, allow command reuse and may reduce query parsing/compilation overhead:
- Prepare a query like the first example, substitute parameters for tag and time range, and execute it repeatedly.
- Many users report substantial improvements when reusing prepared commands for a bulk of per-tag queries.
4. Parallelization and Multithreading
Since the bottleneck is usually query execution, parallelizing the queries can harness more CPU and network bandwidth—mimicking how PI OLEDB itself dispatches underlying PISDK queries asynchronously.
- .NET Task Parallel Library (TPL): Easy and robust way to run queries in parallel groups using
Parallel.ForEachwith long-lived connections per thread. - Connection Pooling: Connection overhead is usually negligible since .NET pools OleDb connections by default.
5. Using PI-SDK or AFSDK Directly
For advanced users, directly leveraging PI-SDK or AF SDK with asynchronous calls may yield extra performance (e.g., through batching, streaming, or custom optimizations). However, such approaches come with higher complexity and a steeper learning curve.
Other Considerations
- Query Impact on PI Server: Issuing many thousands of queries in parallel can strain the PI Data Archive if not tuned properly. Monitor PI system health and leverage batch/parallelizing judiciously.
- Time Range Design: If possible, align tag time windows to minimize the number of distinct queries. If you must track precise per-tag progress, consider whether you can tolerate some degree of lag (e.g., query for larger windows and filter in code).
- Result Size and Efficiency: The bottleneck is typically query execution, not the returned result size, based on user reports and benchmarks.
Recommendations
- Use
WHERE tag IN (...)queries when all tags can share a time window. Batch tags for each query to avoid SQL or provider limits. - For per-tag time windows, reuse prepared, parameterized commands and parallelize at the application layer using TPL or classic threading.
- Monitor PI Server load and tune parallelism to avoid unintended strain.
- Consider ELT (Extract-Load-Transform) patterns—fetch more data than needed, then filter/aggregate in the client.
- Evaluate AF SDK for new projects if you need more flexibility, as this is the strategic direction for OSIsoft PI development.
Conclusion
When querying PI Archive data for thousands of tags at frequent intervals, thoughtful batching and batching strategies can yield dramatic performance improvements. Direct, synchronous, per-tag queries simply don't scale. Leverage group queries, parameterization, and multithreading for robust high-throughput solutions—but always balance efficiency with system impact. If you're developing for the long term, explore AF SDK for the most flexible and modern access to PI data.
Keywords: PI OLEDB, PI System, performance, high-volume queries, .NET, parallelism, AF SDK, batching
Tags
About Roshan Soni
Expert in PI System implementation, industrial automation, and data management. Passionate about helping organizations maximize the value of their process data through innovative solutions and best practices.
No comments yet
Be the first to share your thoughts on this article.
Related Articles
Enhancing PI ProcessBook Trends with Banding and Zones: User Needs, Workarounds, and the Road Ahead
A look at the user demand for trend banding/zoning in OSIsoft PI ProcessBook, current VBA workarounds, UI challenges, and how future PI Vision releases aim to address these visualization needs.
Roshan Soni
Migrating PIAdvCalcFilVal Uptime Calculations from PI DataLink to PI OLEDB
Learn how to translate PI DataLink's PIAdvCalcFilVal advanced calculations—like counting uptime based on conditions—into efficient PI OLEDB SQL queries. Explore three practical approaches using PIAVG, PIINTERP, and PICOunt tables, and get tips for validation and accuracy.
Roshan Soni
Understanding PI Web API WebID Encoding: Can You Generate WebIDs Client-Side?
Curious about how PI Web API generates WebIDs and whether you can encode them client-side using GUIDs or paths? This article explores the encoding mechanisms, current documentation, and best practices for handling WebIDs in your applications.
Roshan Soni