I have a large binary file (84GB) that needs to be broken down into smaller file sizes (~1GB to 8GB) for analysis. The binary file is on a 32-bit machine and cannot be copied to another machine for analysis. The PC has Visual Studio 6.0 and is not upgradable. My issue is I'm using the following generic code to create the smaller files.
fseek(f, start, SEEK_SET);
end = start + (variable based on file size);
for (i = start; i < end; i++) {
    if (!feof(f)) {
        byte = fgetc(f);
        fputc(byte, new_file);
    }
}
However, on a 32-bit machine, the iterator can only count up to ~2 billion, which means that I'm unable to copy anything past ~2GB. My original idea was to delete from the large binary file as I read from it so that I can reset the iterator on every read. However, I haven't come across a way to delete binary file entries.
Is there any other way to break down a large binary file into smaller units? Or is there a way to delete binary file entries in sections or per entry?
On a 64-bit machine I could use _fseeki64. I've been reading that some versions of Visual Studio 6.0 are capable of supporting 64-bit numbers, but when using _fseeki64 or _lseeki64 on this machine, it's an "undeclared identifier".
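One possible approach that avoids 64-bit offsets altogether is to read the file sequentially and count blocks instead of bytes, so no counter ever gets near 2 billion. The sketch below is only an illustration: split_file, the ".000" suffix scheme, and the parameters are made-up names, it is written against a modern standard library (Visual Studio 6.0 would need small adjustments such as _snprintf), and it assumes the C runtime can keep reading sequentially past 2 GB, which I have not verified on that compiler.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: splits `path` into pieces named <prefix>.000, <prefix>.001, ...
// Each piece holds at most blocks_per_piece * block_size bytes.
// Returns the number of pieces written, or -1 on error.
int split_file(const char* path, const std::string& prefix,
               std::size_t block_size, unsigned long blocks_per_piece)
{
    if (block_size == 0 || blocks_per_piece == 0) return -1;
    std::FILE* in = std::fopen(path, "rb");
    if (!in) return -1;

    std::vector<unsigned char> buf(block_size);
    std::FILE* out = 0;
    int pieces = 0;
    unsigned long blocks_in_piece = 0;   // counts blocks, not bytes
    std::size_t got;

    while ((got = std::fread(&buf[0], 1, block_size, in)) > 0) {
        if (!out) {                      // start the next output piece
            char suffix[16];
            std::snprintf(suffix, sizeof suffix, ".%03d", pieces);
            out = std::fopen((prefix + suffix).c_str(), "wb");
            if (!out) { std::fclose(in); return -1; }
            ++pieces;
            blocks_in_piece = 0;
        }
        if (std::fwrite(&buf[0], 1, got, out) != got) {
            std::fclose(out);
            std::fclose(in);
            return -1;
        }
        if (++blocks_in_piece == blocks_per_piece) {   // piece is full
            std::fclose(out);
            out = 0;
        }
    }
    if (out) std::fclose(out);
    std::fclose(in);
    return pieces;
}
```

With a 1 MB block, split_file("big.bin", "part", 1048576, 4096) would produce pieces of about 4 GB each. Reading a block at a time should also be much faster than fgetc/fputc per byte.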
I need to parse the file in such a way that I can create a filesystem hierarchy as if I were enumerating files/directories. Ultimately I want to add these to a tree gui control with everything under its proper node without duplicating anything. It should look roughly like so:
dir -file -dir -file -dir -file
I can open the file and add nodes/children to the tree control, but how should I go about doing the actual parsing? How can I find a filename and say "this belongs under this node"? I want to do this as efficiently as possible, even if I must use multiple threads.
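The file format is not described, so the following is only a sketch under the assumption that each entry can be turned into a path string such as "dir/sub/file". Node and insert_path are hypothetical names. Each node keeps its children in a map keyed by name, so looking up "which node does this belong under" is one map lookup per path component and duplicates are merged automatically.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical tree node: children are keyed by name, so a name appears once per parent.
struct Node {
    std::map<std::string, std::unique_ptr<Node>> children;
};

// Walks the path one component at a time, creating nodes that do not exist yet.
// Returns the node for the last component.
Node* insert_path(Node& root, const std::string& path, char sep = '/')
{
    Node* cur = &root;
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t end = path.find(sep, start);
        if (end == std::string::npos) end = path.size();
        if (end > start) {   // skip empty components such as a leading separator
            std::unique_ptr<Node>& child = cur->children[path.substr(start, end - start)];
            if (!child) child.reset(new Node);
            cur = child.get();
        }
        start = end + 1;
    }
    return cur;
}
```

Once the whole file has been inserted, the GUI tree can be filled with a single recursive walk over the children maps. I would build it this way on one thread first and measure before adding more threads.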
So I'm attempting to write a program that will parse through a large file (genome sequences) and I'm basically wondering what options I should consider if I wanted to either:
a) store the entire genome in memory and then parse through it
b) parse through a file in small portions
If I go with "a", should I just read a file into a vector and then parse through it? And if I go with "b" would I just use an input/output stream?
I'm parsing an XML file full of payslips and using the data in another application. I've got it all working, but I suspect it isn't the most elegant piece of code. I run through the XML file finding a series of "Text" attributes/elements, and then I run through it again finding a series of "Field" attributes/elements. Here is a sample of the code:
//
// Get all the Text attributes & Elements
//
foreach (XElement xtxt in xdoc.Descendants(ns + "Text"))
[Code]...
This works fine: I extract all the data I'm interested in and go on to do my thing with it. However, I really need to know when each record ends. I was doing that by looking for "Text24" in the text fields and "EeRef2" in the field fields, which wasn't very elegant in the first place. Then a "Text16" was added to the end of each record, which was fine, as I could just look for "Text16", but now it's apparent that "Text16" isn't always there. I've got it all working for now, but I'd prefer to process one record at a time, i.e. extract all the "Text" & "Field" values for one record, do whatever I need to do with it, update the XML file to indicate this progress (if possible) and then move on to the next record. I've attached a sample of the XML, but basically it has the following structure:
1. first my professor required me NOT to change the MAIN function(because he made it)
2. I have to make 3 getlogs() STRING FUNCTIONS:
a. string getLogs(); - accepts no parameters, SHOWS ALL THE CONTENTS OF THE TEXT FILE
b. string getLogs(const string & a); - accepts 1 parameter, SHOWS ONLY THE LINE WHICH CONTAINS THE DATE SPECIFIED IN THE MAIN FUNCTION, which is "2014-08-01"
c. string getLogs(const string & b, const string & c); - accepts 2 parameters, SHOWS ONLY THE LINES FROM THE START DATE to the END DATE specified in THE MAIN FUNCTION, which are DateStart = "2014-08-01"; DateEnd = "2014-08-10";
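A rough sketch of the three overloads, under assumptions the post does not confirm: the log file is called "logs.txt" (LOG_FILE is a made-up name), and every line starts with a date in YYYY-MM-DD form. Dates in that form sort correctly as plain strings, so the range check needs no date parsing. The helper linesBetween is hypothetical too.

```cpp
#include <fstream>
#include <string>

const char* const LOG_FILE = "logs.txt";   // assumed file name, not from the assignment

// Hypothetical helper: keeps the lines whose leading date lies in [from, to].
static std::string linesBetween(const std::string& from, const std::string& to)
{
    std::ifstream in(LOG_FILE);
    std::string line, out;
    while (std::getline(in, line)) {
        const std::string date = line.substr(0, 10);   // "YYYY-MM-DD"
        if (date >= from && date <= to) out += line + "\n";
    }
    return out;
}

// a. no parameters: the whole file
std::string getLogs()
{
    std::ifstream in(LOG_FILE);
    std::string line, out;
    while (std::getline(in, line)) out += line + "\n";
    return out;
}

// b. one parameter: only the lines for that date
std::string getLogs(const std::string& day) { return linesBetween(day, day); }

// c. two parameters: lines from the start date to the end date, inclusive
std::string getLogs(const std::string& from, const std::string& to)
{
    return linesBetween(from, to);
}
```

Since the functions return strings, the unchanged main function can print them with cout.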
What's the ideal way to get an iterator to the item that has the largest key (int) smaller than a given value?
Basically, the item before upper_bound(). I can use upper_bound() and then decrement, but it needs special cases for both end() and begin(), and in the case of end() I'm not sure how to get to the last item in the map; AFAIK, we're not allowed to decrement end().
Code:
auto it = mymap.upper_bound(x);
if (it==mymap.begin()) // first item in the map is already too large. reject
    NotFound();
else if (it==mymap.end())
[Code] .....
// here it points to largest item smaller than x.
I can iterate over the entire map and do a compare, but then I pretty much lose the benefit of the binary search.
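For what it's worth, decrementing end() is fine on a non-empty std::map, because its iterators are bidirectional, so only the begin() case needs special handling. A minimal sketch (largest_below is a made-up name) using lower_bound(), which gives "strictly smaller than x"; swapping in upper_bound() would give "smaller than or equal to x" instead:

```cpp
#include <map>

// Returns an iterator to the element with the largest key strictly less than x,
// or m.end() if there is no such element.
template <class Map>
typename Map::const_iterator largest_below(const Map& m, const typename Map::key_type& x)
{
    typename Map::const_iterator it = m.lower_bound(x);   // first key >= x
    if (it == m.begin()) return m.end();   // nothing smaller; also covers an empty map
    --it;                                   // valid even when it == m.end()
    return it;
}
```

This keeps the logarithmic lookup, since the only extra work is one decrement.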
I have this file that I would like to read into a multidimensional array in C#: file. If I take the first set of lines as a first example, I would like the print_r to look something like this:
Will the realloc just reduce the allocated size and keep the same pointer, or can there be a chance of it finding another place for that allocation ( Meaning that it will expensively move the memory to another location )?
I am trying to create efficient programs by making my dynamic allocations as light on resources as possible at runtime.
const string ABC =
"   A  B  C  D\n"
"A  1 -1  2 14\n"
"B  0 -2 -4  8\n"
"C  6  2  2  3\n";
So if I have it as a string stream and then loop through each line like this:
Code:
istringstream in(ABC);
for (string line; getline(in, line); ) {
    vector<char> vec(line.begin(), line.end());
    for (int i = 0; i < vec.size(); i++)
        cout << vec[i] << " ";
}
I get my strings chopped into single characters. How can I chop them into "meaningful" tokens instead, so that -1 does not become - and 1? Is there a quick way to do that?
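One simple way, sketched below, is to run a second istringstream over each line and extract whitespace-separated tokens with operator>>, which keeps "-1" together. tokenize_lines is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits text into lines, and each line into whitespace-separated tokens.
std::vector<std::vector<std::string>> tokenize_lines(const std::string& text)
{
    std::vector<std::vector<std::string>> rows;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line); ) {
        std::istringstream fields(line);
        std::vector<std::string> row;
        for (std::string tok; fields >> tok; ) row.push_back(tok);   // ">>" skips whitespace
        if (!row.empty()) rows.push_back(row);
    }
    return rows;
}
```

The numeric tokens can then be converted with std::stoi, or extracted directly into int variables instead of strings.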
I have been attempting to store mathematical functions in a file by parsing them into a linked list with a variable-sized char ** array as my storage device. I have run into problems with the memory management. The program crashes before output is flushed to the console, so printf() wasn't a debugging option. Neither is my actual debugger, since it seems to get a SIGTRAP every time. I have my warnings turned all the way up, but no errors or warnings are appearing. The part I know works is the code that opens the file and gets a line from it. The two functions that implement the linked list are most likely where the problem lies. My current attempt is basically to store the size of the dynamic array in the structure and keep resizing it until there are no more tokens. Then I will store the number of elements of the array in the structure and move on to the next node.
I'm parsing a text file, and I'd like to detect when a certain Compilation Condition - i.e. #ifdef - begins. The challenge is, that the condition can take any of the following patterns:
#ifdef (FLAG)
#if defined (FLAG)
#if (defined (FLAG))
(And perhaps I missed more)
I'd of course need to treat them all the same, as they are indeed the same. How would you know to treat them all the same?
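One option is a single regular expression that normalizes all of these spellings to the flag name. The sketch below (ifdef_flag is a made-up name) only handles a lone "is FLAG defined" test; compound conditions such as defined(A) && defined(B), trailing comments, and line continuations are deliberately not matched, and a real preprocessor-aware parser would be needed for those.

```cpp
#include <regex>
#include <string>

// Returns the flag name if `line` is a simple "#ifdef FLAG"-style test
// (with or without parentheses), otherwise an empty string.
std::string ifdef_flag(const std::string& line)
{
    static const std::regex re(
        R"re(^\s*#\s*(?:ifdef\b\s*\(?\s*(\w+)\s*\)?|if\b\s*\(?\s*defined\b\s*\(?\s*(\w+)\s*\)?\s*\)?)\s*$)re");
    std::smatch m;
    if (!std::regex_match(line, m, re)) return "";
    // Group 1 is filled by the #ifdef form, group 2 by the #if defined form.
    return m[1].matched ? m[1].str() : m[2].str();
}
```

Note that the pattern does not check that parentheses are balanced, so it is a detector for well-formed input rather than a validator.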
I am parsing a binary data file by casting a buffer to a struct. It seems to work really well apart from this one double which is always being accessed two bytes off, despite being in the correct place.
If I attempt to print GROSS using printf("%f", row->GROSS) I get 0.0000. However, if I change the type of GROSS to char[8] and then use the following code, I am presented with the correct number...
Code:
typedef struct example {
    double d;
} example;
example *blah = (example*)row->GROSS;
printf("%f", blah->d);
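Without the struct definition I can only guess, but a common cause of this symptom is padding: the compiler may insert bytes before a double so that it is aligned, so the struct layout no longer matches the file layout. A sketch that sidesteps the layout question by copying the field out of the raw record at its known byte offset (read_double_at is a hypothetical name, and it assumes the file uses the machine's own byte order and floating-point format):

```cpp
#include <cstddef>
#include <cstring>

// Copies a double out of a raw record at the given byte offset.
// memcpy has no alignment requirement, so any offset is acceptable.
double read_double_at(const unsigned char* record, std::size_t offset)
{
    double value;
    std::memcpy(&value, record + offset, sizeof value);
    return value;
}
```

The alternative is to keep the cast but tell the compiler to pack the struct (for example with #pragma pack on compilers that support it), at the cost of portability.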
I have a specific byte (that is unsigned char) array, that I need to find in a very big file (2GB or so), currently I'm using:
size_t fsFind(FILE* device, byte* data, size_t size) {
    int SIZE = (1024 > size) ? 1024 : size;
    byte buffer[SIZE];
    int pos = 0;
    int loc = ftell(device);
[Code] ....
This seems to find the proper result on first use, but on subsequent searches it seems to find invalid positions. Is there something I'm doing wrong, or is there a better way?
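The rest of fsFind is elided, so I can't say what goes wrong in it, but two things commonly bite in this kind of search: a match that straddles two buffer reads, and the file position after a hit being somewhere past the match rather than at it. Here is a sketch of a buffered search that tracks the file offset of the buffer explicitly and carries the last size-1 bytes over between reads. find_bytes is a made-up name, and since it uses long offsets it is limited to what ftell can report (about 2 GB where long is 32 bits).

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Returns the file offset of the first match at or after the current position,
// or -1 if the pattern is not found.
long find_bytes(std::FILE* f, const unsigned char* pat, std::size_t n)
{
    if (n == 0) return -1;
    const std::size_t chunk = std::max<std::size_t>(4096, 2 * n);
    std::vector<unsigned char> buf(chunk);
    long base = std::ftell(f);   // file offset of buf[0]
    if (base < 0) return -1;
    std::size_t have = 0;        // number of valid bytes in buf

    for (;;) {
        const std::size_t got = std::fread(&buf[have], 1, chunk - have, f);
        have += got;
        if (have < n) return -1;
        unsigned char* last = buf.data() + have;
        unsigned char* hit = std::search(buf.data(), last, pat, pat + n);
        if (hit != last) return base + static_cast<long>(hit - buf.data());
        if (got == 0) return -1;   // end of file and nothing found
        // Keep the final n-1 bytes so a match straddling two reads is not missed.
        const std::size_t keep = n - 1;
        std::copy(last - keep, last, buf.data());
        base += static_cast<long>(have - keep);
        have = keep;
    }
}
```

To look for the next occurrence, seek to the returned offset plus one before calling again; the file position after a successful call is not the match position.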
I am currently working on a problem in which a C program has to show a large text file in parts. For example, if the file contains 200 lines, 50 lines are shown on the first page and the user is asked to press any key to move to the next page until EOF is reached. The user is allowed to return to the previous page as well, and this is a very complicated task for me. I tried to move the cursor to a specific position using fseek etc., but the output doesn't stop at the end of a page and runs through to the end of the file.
I'm trying to read and store several students' information so that I can have an options menu where I can enter a student number and the program prints all the information stored about that student. The way I have it set up now doesn't work for this, because all the info is reinitialized to stud1. Is there another way to store this info other than defining stud1, stud2, ..., stud200?
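Instead of one variable per student, a container keyed by student number is the usual answer. A small sketch; the Student fields and the Roster class are invented for illustration, since the post does not list what is stored:

```cpp
#include <map>
#include <string>

struct Student {            // hypothetical fields
    std::string name;
    double gpa;
};

class Roster {
public:
    // Stores (or overwrites) the record for this student number.
    void add(int number, const Student& s) { students_[number] = s; }

    // Returns a pointer to the record, or a null pointer if the number is unknown.
    const Student* find(int number) const
    {
        std::map<int, Student>::const_iterator it = students_.find(number);
        return it == students_.end() ? 0 : &it->second;
    }

private:
    std::map<int, Student> students_;
};
```

In plain C the equivalent would be an array of structs, e.g. 200 elements, searched by student number.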
So I'm trying to parse a string into an IP address, but I have a problem: the IPAddress.Parse method only seems to work for IPv4 addresses. If I use IPAddress.Parse on my public (remote) IP it throws an exception, but on an IPv4 local IP it doesn't. How do I parse ANY IP address that the user inputs as a string into an IPAddress?
I have been here for almost 3 months looking for answers to my C++ problems. Here's some code for this one.
cout << "Enter value of x: " << endl; //Let's say 5.
cin >> x;
cout << "Enter equation: "; //Let's say x+1
cin >> equation;
Then the program should recognize that the character "x" has an initial value of 5. I already have the parser for the equation operators (+, -, *, /). This is the only thing lacking. Is there some type of function that I missed?
I have an input file that contains any number of lines. Each line will follow the same structure. There will be 5 fields, separated by 4 commas. Fields 1 and 3 will be single characters, fields 2,4,5 will be integers. Example:
<A, 123, B, 456, 789>
I need to iterate over each line, parse out the fields, and capture each field value into a separate variable.
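A minimal sketch of parsing one such line with a string stream. It is not clear whether the angle brackets in the example are part of the data, so this version accepts the line with or without them; Record and parse_record are made-up names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

struct Record { char f1; int f2; char f3; int f4; int f5; };

// Parses "A, 123, B, 456, 789", optionally wrapped in < >.
// Returns false if a field is missing or a separator is not a comma.
bool parse_record(std::string line, Record& r)
{
    for (std::size_t i = 0; i < line.size(); ++i)
        if (line[i] == '<' || line[i] == '>') line[i] = ' ';   // treat brackets as blanks
    std::istringstream in(line);
    char c1, c2, c3, c4;   // the four separators
    if (!(in >> r.f1 >> c1 >> r.f2 >> c2 >> r.f3 >> c3 >> r.f4 >> c4 >> r.f5))
        return false;
    return c1 == ',' && c2 == ',' && c3 == ',' && c4 == ',';
}
```

Calling this inside a std::getline loop over the input file handles any number of lines.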
Suppose I have read a line from an ASCII file with fgets(). Now I want to parse the line, which looks something like this:
Code:
# John Q. Public et al. 2014, to be submitted
The name, "John Q. Public", is what I want. However, the name can be anything, consisting of 1 or more tokens separated by spaces. It could be "John", or "John Public", or "Thurston Howell the 3rd", etc. Basically, I need to get the entire substring between the first hash mark and the "et al" in the line. I tried this:
Code:
sscanf(line,"# %s et al.",name);
But I can only get the first token (which, in this case, is "John").
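%s stops at the first whitespace, which is why only "John" comes back. One way around it is to locate the two delimiters and take what lies between them. The sketch below does this with std::string (in C, strchr and strstr would play the same roles); extract_name is a hypothetical name, and it searches for " et al" with a leading space so that a name merely containing those letters is less likely to be cut short.

```cpp
#include <cstddef>
#include <string>

// Returns the text between the first '#' and the following " et al",
// with surrounding blanks removed, or "" if either marker is missing.
std::string extract_name(const std::string& line)
{
    const std::size_t hash = line.find('#');
    if (hash == std::string::npos) return "";
    const std::size_t etal = line.find(" et al", hash + 1);
    if (etal == std::string::npos) return "";
    const std::size_t first = line.find_first_not_of(" \t", hash + 1);
    if (first == std::string::npos || first >= etal) return "";   // no name present
    const std::size_t last = line.find_last_not_of(" \t", etal);
    return line.substr(first, last - first + 1);
}
```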
I have to make a C++ program in which I have to encode a text from a file with an algorithm and write it to another file. The input should look like this: "code forCoding.txt toBeWritten.txt", or like this: "decode toBeReadFor.txt toBeWrittenIn". I have done everything except one thing: it says I have to be able to take input parameters.
How should I write this? I read [URL] ....., but I still don't get it. The input of my program has to have 3 strings, so I guess argc should be 3, but I don't really get it. What should I have in my main for parsing these command line parameters?
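A sketch of the argument handling, assuming the program is started as "program code in.txt out.txt", i.e. "code"/"decode" is the first argument rather than the program's own name. Because argv[0] holds the program name, three user-supplied strings mean argc is 4, not 3. Options and parse_args are made-up names.

```cpp
#include <string>

struct Options {
    bool encode;          // true for "code", false for "decode"
    std::string input;
    std::string output;
};

// Fills `opt` from the command line. Returns false if the arguments are not
// exactly: <code|decode> <input file> <output file>.
bool parse_args(int argc, const char* const argv[], Options& opt)
{
    if (argc != 4) return false;          // argv[0] is the program name
    const std::string mode = argv[1];
    if (mode == "code") opt.encode = true;
    else if (mode == "decode") opt.encode = false;
    else return false;
    opt.input = argv[2];
    opt.output = argv[3];
    return true;
}
```

In main(int argc, char* argv[]) this would be called as parse_args(argc, argv, opt), printing a usage message and returning when it yields false.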
For a rather complex and strange reason that I won't explain right now, I need to have this going on in my program.
class FVF{
private:
    vector<vector<float>> data; //Contains fvf data for Direct3D stuff
public:
[Code].....
The FVF allows this Model3D class to also be compatible with the file handling methods I've got, but here's the problem: D3D buffers require an array to feed them the information. I know that for a single-dimension vector I can use vec.data(), but how do I do this for multiple dimensions?
I think the best Idea I've got so far is to set the vector within the Model3D class as a pointer, then I can union it with a float pointer... Once I can guarantee the information is correct and complete, manually transfer the contents of the vectors into the float pointer.. (The union is to reduce memory needed instead of having the data repeated in vectors and arrays)
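A caveat on the union idea: a vector<vector<float>> does not store its floats in one contiguous block (each inner vector owns a separate allocation), so there is no single pointer that can stand in for the whole thing. The straightforward route is to copy the rows into one flat vector and pass its data() pointer to the buffer. A minimal sketch, with flatten as a hypothetical name:

```cpp
#include <cstddef>
#include <vector>

// Copies all rows, in order, into one contiguous vector.
std::vector<float> flatten(const std::vector<std::vector<float>>& rows)
{
    std::size_t total = 0;
    for (const auto& r : rows) total += r.size();

    std::vector<float> flat;
    flat.reserve(total);   // single allocation for the whole result
    for (const auto& r : rows) flat.insert(flat.end(), r.begin(), r.end());
    return flat;
}
```

If the duplicate copy is a concern, another option is to store the data flat in the first place and compute the index as row * row_length + column, which works as long as every row has the same length.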