VOB file elements are being corrupted with "bad character" and "text_file_delta" errors

Sebastian Specialist CM's picture

Hello,

Our VOB is 52 GB in size, 10 years old, hosted on Solaris, and used by 400 users. Over the past year, we have observed that VOB elements keep getting corrupted with "^D" or "text_file_delta" errors. We are unable to diff against the predecessor version, and we cannot make builds because of the corrupted files. We cannot even count source lines of code. I forgot to mention that we are using ClearCase version 6.0.

About 8 months ago, we raised a PMR. IBM repaired some file elements using "checkvob" and, through remote access, ran some IBM scripts to solve the issue. By the end of March, we and the IBM Technical Support engineers had resolved all the corrupted file elements. Recently, we observed that two of the files got corrupted again and that the VOB is not in sync with two of our replicas. The problem has started again.

Please let me know why VOB file elements get corrupted so often. Could it be caused by users working in the VOB with third-party tools for coding/development? We CMs follow strict policies. Some of the file elements are not even loading (in snapshot views), reporting "size got exceeded". This has become a big mess for us. Developers are now making builds by hijacking the files. We know that this is not good practice. Can someone please help us in this regard? Please find attached the VOB errors. Thanks in advance!

Best Regards/Sebastian

3 Answers

Bob Aiello's picture

Hi Sebastian,

ClearCase is far and away one of the most reliable version control repositories in the industry. But repository corruption does indeed happen. I had a situation where the ClearCase vob got corrupted when we were using a NAS device for storage. We actually recreated the problem outside of ClearCase by just writing some files. The problem turned out to be a CIFS (e.g. Windows) configuration issue on the NAS device itself. It had nothing to do with ClearCase.

There are five utilities used to check the integrity of the vob repository. checkvob should be run each night. I haven't used ClearCase in a few years, so I am forgetting the names of the other utilities, but I think checkdb (or is it dbcheck) verifies the underlying Raima database.

What you really need to do is move from one monolithic ClearCase vob to a bunch of vobs. Since you can script your administration commands, having many vobs is just as easy to maintain as having only one or two. (Actually easier, because you will have fewer problems.)

My scripts would iterate over a list of vobs (excluding a couple of test vobs). Even with a corrupt vob you can often get out the history for a specific slice of the vob (or get IBM to help you use the vob_splitter utility).
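
A minimal sketch of that kind of nightly loop, in Python. It is an illustration only, not the scripts mentioned above: the storage paths, the exclusion list and the function names are hypothetical, and the default checkvob options are an assumption that should be verified against the checkvob reference page for your ClearCase release.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# Assumption: these options check the source pool of one VOB storage directory.
# Verify them against the checkvob reference page before relying on this.
DEFAULT_CHECK_CMD: Sequence[str] = ("cleartool", "checkvob", "-pool", "-source")


def select_vobs(storage_paths: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Return the VOB storage paths to check, skipping excluded (e.g. test) VOBs."""
    skip = set(excluded)
    return [path for path in storage_paths if path not in skip]


def run_nightly_checks(
    storage_paths: Iterable[str],
    excluded: Iterable[str],
    check_cmd: Sequence[str] = DEFAULT_CHECK_CMD,
    runner: Callable = subprocess.run,
) -> List[str]:
    """Run the check command once per VOB and return the paths whose check failed."""
    failures = []
    for path in select_vobs(storage_paths, excluded):
        # A non-zero exit status is treated as a failed check (an assumption).
        result = runner([*check_cmd, path], capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(path)
    return failures
```

Called from a nightly job, `run_nightly_checks(paths, excluded=[...])` returns the storage paths that need attention. Passing a different `runner` allows a dry run without touching any vob.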

I used to periodically run reformatvob on the vobs just to ensure that they were ok, and I set up a separate ClearCase registry on a separate machine to do this (you cannot use a copy of a vob in the same registry because the UUIDs will conflict).

You will see fewer problems with smaller vobs (even if you have a lot of them). We literally had hundreds of vobs and ran checkvob every night. I am knee deep in implementing RTC, TFS and Git but ClearCase still rocks!

Feel free to connect with me on LinkedIn or ping me via email ([email protected]) to discuss further, and do post how this all works out!

Bob Aiello, Technical Editor CM Crossroads
http://www.linkedin.com/in/BobAiello

Sebastian Specialist CM's picture

Hello Bob,

Thanks for your response.
We copy the source container of the corrupted file element from our remote replica into the source container directory of that element in our VOB. Sometimes placing the remote replica's source container in our VOB helps. If it does not, we run "checkvob -fix <file_element>". This shows us how many versions are corrupted and prompts whether to fix them by marking the respective versions as "nodata (ndata)". Unfortunately, our CM branch versions were also in this list. Before running checkvob, we have to recover the corrupted versions to get the codebase. But in this scenario we cannot get the code inside the corrupted versions by using the respective CM labels in config specs, because we are not allowed to perform any operation on them: diff, opening the version tree of the corrupted labeled version, opening the corrupted version in vi, and so on. We are in big trouble, as there are maintenance builds and latest builds that we need to send to our clients.
Our CM team needs to explore the ideas from your previous post further, because IBM and our ClearCase admins tried a lot and were successful last time. Now the problem is back.
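
The container replacement described above can be sketched as a small Python helper that keeps a copy of the damaged file before overwriting it. This is only an illustration under several assumptions: all paths and the function name are hypothetical, the VOB is assumed to be locked while the copy happens, and the replica's container is assumed to have been transferred to the local host already. The helper does not check that the replica's container is valid for the local replica.

```python
import shutil
from pathlib import Path


def replace_container(damaged: Path, replica_copy: Path, backup_dir: Path) -> Path:
    """Back up the damaged container, then copy the replica's container over it.

    Returns the path of the backup so the original bytes can be restored.
    """
    if not replica_copy.is_file():
        raise FileNotFoundError(f"replica container not found: {replica_copy}")
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup = backup_dir / (damaged.name + ".damaged")
    shutil.copy2(damaged, backup)        # keep the corrupted file for later analysis
    shutil.copy2(replica_copy, damaged)  # overwrite with the replica's container
    return backup
```

Keeping the damaged container aside means the swap can be undone, and the corrupted bytes remain available if IBM support asks for them.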

Best Regards,
Sebastian

Tamir Gefen's picture

If corruption occurs more than once, I'd check the physical resources: network, disks, electricity. For instance, a power outage could result in a loss of data affecting a specific version of an element.

There are many other things to investigate, but I'm short on time... I'd suggest you download the free trial of ClearCheck from our website. It scans the VOB and generates a legible report. It might provide some clues...

Regards,

Tamir Gefen, GoMidjets

CMCrossroads is a TechWell community.