Troubleshooting Archive and Point Database Mismatches in PI System After OMF Endpoint Failures
A real-world case highlights how a mysterious OMF endpoint failure after stress-testing led to a PI system boot error caused by a point database and archive mismatch. See how `pibasess -snapfix` fixed a critical outage and get actionable troubleshooting tips.
Roshan Soni
Troubleshooting Point Database Mismatches in the PI System: Lessons from a Real-World OMF Endpoint Failure
Working with the PI System's OMF (OSIsoft Message Format) endpoints can be powerful, but as with many complex systems, edge cases can reveal subtle issues. In this post, I'll share a real-world experience that led to a critical PI system outage, its diagnosis, and ultimately its fix. If you're working with PI Web API, OMF data ingestion, or regular backfilling and analysis operations, these lessons could save your system from a similar fate.
The Problem: When OMF Data Submission and PI System Management Break
A PI System user was stress-testing their OMF endpoint by backfilling a PI point with a year's worth of second-interval data—an operation generating over 31 million data points. After successfully importing the data, the user attempted to analyze this trend via PI System Explorer by running queries over various time ranges. Suddenly, the system became unresponsive; after forcefully terminating the hung process, things went from bad to worse:
- Dynamic Type Deletion and Data Ingestion via OMF Endpoint Stopped Working:
  - Deleting OMF containers/types failed with cryptic error codes (e.g., 4003, -3012).
  - Submitting new OMF data resulted in errors (e.g., error code 6001).
- Manual Deletion via PI System Explorer Didn't Help: Deleting problematic points directly did not resolve type deletion issues.
- Rebooting the PI Application Server Had No Effect: The underlying problem persisted.
Diagnosis: Hidden Point Database Corruption
Upon eventually rebooting the main PI System server, the following fatal error appeared:
Fatal error in PI subsystem piarchss: Archive Point Count: 953, Point Database count: 952, status: [-11162] Archive and Base Point Count Mismatch - Contact Technical Support
The mismatch in point counts between the archive and point database prevented the system from booting—clearly indicating deeper corruption or inconsistency.
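If you collect startup messages into a text file or monitoring tool, a small parser can flag this condition automatically. The following is a minimal sketch, assuming the message always follows the layout of the single error quoted above; the function name and regular expression are illustrative and not part of any PI tooling.

```python
import re

# Pattern derived only from the one error message quoted in this post;
# other PI versions may word the message differently (assumption).
MISMATCH_RE = re.compile(
    r"Archive Point Count:\s*(?P<archive>\d+),\s*"
    r"Point Database count:\s*(?P<pointdb>\d+),\s*"
    r"status:\s*\[(?P<status>-?\d+)\]"
)


def parse_point_count_mismatch(line):
    """Return (archive_count, pointdb_count, status), or None if the line does not match."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return int(match["archive"]), int(match["pointdb"]), int(match["status"])


line = (
    "Fatal error in PI subsystem piarchss: Archive Point Count: 953, "
    "Point Database count: 952, status: [-11162] Archive and Base Point "
    "Count Mismatch - Contact Technical Support"
)
archive_count, pointdb_count, status = parse_point_count_mismatch(line)
print(archive_count - pointdb_count)  # 1
```

Here the archive reports one more point than the point database, which is the discrepancy that blocked startup.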
Why Can This Happen?
Backfilling large volumes of high-frequency data, especially while simultaneously querying and managing PI Points (e.g., renaming, trend analyses, or mass deletions via OMF/Explorer), can temporarily overwhelm PI subsystems responsible for metadata management and archival. In rare cases, as experienced here, the point database and archive's internal counters can become desynchronized, leading to unmanageable system states.
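To put the volume in perspective, the arithmetic behind the "over 31 million" figure can be checked quickly. This is a rough sketch; the batch size of 10,000 is a hypothetical value chosen for illustration, not a recommendation from OSIsoft.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def backfill_point_count(days, interval_seconds=1):
    """Number of samples produced by `days` of data at a fixed interval."""
    return days * SECONDS_PER_DAY // interval_seconds


def batch_count(total_points, batch_size):
    """Number of requests needed if each request carries at most `batch_size` samples."""
    return math.ceil(total_points / batch_size)


total = backfill_point_count(365)
print(total)  # 31536000
print(batch_count(total, 10_000))  # 3154
```

Even with 10,000 values per request, a single point backfilled this way means several thousand write requests, before any queries or metadata operations are added on top.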
The Solution: pibasess -snapfix to the Rescue
Standard troubleshooting steps (restarting services, rebooting the PI application server, inspecting Windows Event logs) did not surface or resolve the underlying problem. Only when the main PI System server itself was rebooted did the system reveal the root cause: the point count mismatch.
To fix:
- Run the Snapfix Utility on the affected PI server: `pibasess -snapfix`. This utility reconciles the point count between the point database and the archive, clearing the corrupt state.
- Restart the PI System: The error was gone, and deleting OMF types and submitting new data functioned as expected again.
Lessons Learned
- PI System Consistency Is Critical: Large, rapid-ingest workflows or heavy administrative operations can sometimes leave the system in an inconsistent state. Be especially careful when mixing high frequency backfilling and metadata operations.
- Point Count Mismatches Can Be Obscure: These may not surface until a reboot or PI subsystem restart. Always monitor archive/point database logs after large-scale changes.
- Logs May Not Directly Reveal the Cause: Windows event logs and error codes (4003, 6001, -3012) may not immediately indicate a metadata-level inconsistency—look for mismatches during service startup.
- `pibasess -snapfix` Is a Powerful Tool: If you hit archive/point database mismatch errors, this should be your go-to fix (but use with care, and always consult OSIsoft documentation or technical support if unsure).
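One possible way to act on the first lesson is to send a backfill in small, paced batches and to hold off on metadata operations until it has finished. The sketch below is an assumption-heavy illustration rather than something validated in the incident above: `build_data_message` and `backfill_in_batches` are hypothetical helpers, the `timestamp` and `value` property names depend on your own OMF type definition, and `send` is a placeholder for whatever function posts the body to your OMF endpoint.

```python
import json
import time
from datetime import datetime, timedelta, timezone


def build_data_message(container_id, first_timestamp, count):
    """Build one OMF-style data message body holding `count` one-second samples."""
    values = [
        {
            # "timestamp" and "value" are hypothetical property names; the real
            # ones come from the OMF type the container was created with.
            "timestamp": (first_timestamp + timedelta(seconds=i)).strftime(
                "%Y-%m-%dT%H:%M:%SZ"
            ),
            "value": 0.0,  # placeholder reading
        }
        for i in range(count)
    ]
    return [{"containerid": container_id, "values": values}]


def backfill_in_batches(container_id, start, total_points, batch_size, send,
                        pause_seconds=0.0):
    """Send `total_points` samples in batches, pausing between batches."""
    sent = 0
    while sent < total_points:
        count = min(batch_size, total_points - sent)
        body = build_data_message(container_id, start + timedelta(seconds=sent), count)
        send(json.dumps(body))  # `send` is supplied by the caller
        sent += count
        if pause_seconds > 0 and sent < total_points:
            time.sleep(pause_seconds)  # give the server room between batches
    return sent


# Dry run: collect the request bodies in a list instead of posting them.
outbox = []
sent = backfill_in_batches(
    "demo-container", datetime(2023, 1, 1, tzinfo=timezone.utc), 25, 10, outbox.append
)
print(sent, len(outbox))  # 25 3
```

Passing `send` in as a parameter keeps the batching logic testable without a live endpoint; in a real run it would wrap your HTTP client and the OMF headers your endpoint expects.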
Conclusion
While PI System and the OMF ecosystem are robust, effective troubleshooting sometimes relies on low-level administrative utilities not often encountered in day-to-day PI development. If your system exhibits mysterious metadata or OMF endpoint issues following heavy data operations, don't forget to check for archive/point mismatches—and remember that pibasess -snapfix can save the day.
Have you experienced similar issues or have tips for bulk data management and troubleshooting in PI? Share your experiences in the comments below!