C++ :: Parsing Through Large Data Sets

May 30, 2013

So I'm attempting to write a program that will parse through a large file (genome sequences) and I'm basically wondering what options I should consider if I wanted to either:

a) store the entire genome in memory and then parse through it
b) parse through a file in small portions

If I go with "a", should I just read a file into a vector and then parse through it? And if I go with "b" would I just use an input/output stream?

View 5 Replies


ADVERTISEMENT

C++ :: Parsing Large Text File

Feb 10, 2013

I have a massive text file containing many thousands of directory and file names with / at the root, like so:

/dir/
/dir/dir/
/dir/dir/dir/
/file
/dir/file
/dir/dir/file
/dir/dir/dir/file

I need to parse the file in such a way that I can create a filesystem hierarchy as if I were enumerating files/directories. Ultimately I want to add these to a tree gui control with everything under its proper node without duplicating anything. It should look roughly like so:

dir
-file
-dir
-file
-dir
-file

I can open the file and add nodes/children to the tree control but how should I go about doing the actual parsing? How can I find a filename and say "this belongs under this node"? I want to do this efficient as possible even if I must use multiple threads.

View 1 Replies View Related

C++ :: Parsing A Large File Into Smaller Units

Dec 16, 2013

I have a large binary file (84GB) that needs to be broken down into smaller file sizes (~1GB to 8GB) for analysis. The binary file is on a 32-bit machine and cannot be copied to another machine for analysis. The PC has Visual Studio 6.0 and is not upgradable. My issue is I'm using the following generic code to create the smaller files.

fseek(file, start, SEEK_SET);
end = start + (variable based on file size);
fseek(file, end, SEEK_SET);
for (i=start; i<end; i++) {
if(!feof(f)) {
byte = fgetc(f);
fputc(byte,new_file);
}
}

However, on a 32-bit machine, the iterator can only count up to ~2billion. Which means that I'm unable to copy anything past ~2GB. My original idea was to delete from the large binary file as I read from it so that I can reset the iterator on every read. However, I haven't come across a way to delete binary file entries.

Is there any other way that to break down a large binary file into smaller units? Or is there a way to delete binary file entries in sections or per entry?

On a 64-bit machine I could use _fseeki64. I've been reading that some versions of Visual 6.0 are capable of supporting 64-bit numbers but when using _fseeki64 or _lseeki64 on this machine its an "undeclared identifier"

View 7 Replies View Related

C++ :: Parsing Data From Text File?

Aug 26, 2014

Requirements in filtering the text file.

1. first my professor required me NOT to change the MAIN function(because he made it)

2. I have to make 3 getlogs() STRING FUNCTIONS:

a. string getlogs(); - accepts no paramters, SHOWS ALL THE CONTENTS OF TEXT FILE
b. string getLogs(const string & a); - accepts 1 parameter -SHOWS ONLY THE LINE WHICH CONTAINS THE SPECIFIED DATE FROM MAIN FUNCTION which is "2014-08-01"
c. string getLogs(const string & b, const string & c); - accepts 2 parameters, SHOWS ONLY THE LINES FROM THE DATE START to DATE END specified at THE MAIN FUNCTION which is date start-"2014-08-01";DateEnd = "2014-08-10";

3. all COUT should be in the MAIN FUNCTION

TEXTFILE CONTAINS:
2014-08-01 06:13:14,Name,4.5,CustomUnit,CustomType
2014-08-02 06:13:14,Name,4.5,CustomUnit,CustomType
2014-08-03 06:13:14,Name,4.5,CustomUnit,CustomType
2014-08-04 06:13:14,Name,4.5,CustomUnit,CustomType

[Code] .....

my codes so far:

//MAIN
#include <iostream>
#include <string>
#include <fstream>
#include <dirent.h>

[Code].....

View 1 Replies View Related

C/C++ :: Parsing String Data Separated By Pipe Delimit

Mar 7, 2014

i tried to parse the string data seperated by Pipe('|') delimiter, here i am getting some error.Please find the below code.

[char* getData(){
char* string = "1355|||250|New";
char* tok1[10],tok2[10],tok3[10],tok4[10],tok5[10];
sscanf(string,"%[^'|'],%[^'|'],%[^'|'],%['^|'],%s",tok1,tok2,tok3,tok4,tok5);
printf("%s %s %s %s %s",tok1,tok2,tok3,tok4,tok5);
}]

I want to print the Value 250 in my string, but it was displaying some garbage values.

View 1 Replies View Related

C :: Parsing Tokens In File Lines Into A Linked List Of Data

Sep 20, 2013

I have been attempting to store mathematical functions in a file by parsing them into a linked list with a variable sized char ** array as my storage device. I have ran into problems with the memory management detail. The program crashes before output is flushed to the console, so printf() wasn't a debugging option. Neither is my actual debugger, since it seems to get a SIGTRAP every time. I have my warnings turned all the way up, but no errors or warnings are appearing. The part I know works is the actual code that opens the file and gets a line from the file. As far as the two functions that implement the linked list, that is most likely where the problem lies. My current attempt is basically to store the size of the dynamic array in the structure and keep resizing it until there are no more tokens. Then I will store the number of elements of the array in the structure and move on to the next node.

Here is my text file I use :

Code:
sqrt( 25 ); pow( 6 ); sin( 2 );
pow( 4 ); tan( pow( 2 ) ); Main.c :

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct file_data

[Code]...

View 7 Replies View Related

C :: Parsing Binary Data File By Casting A Buffer - Accessing Double From Struct

Jan 4, 2014

I am parsing a binary data file by casting a buffer to a struct. It seems to work really well apart from this one double which is always being accessed two bytes off, despite being in the correct place.

Code:

typedef struct InvoiceRow {
uint INVOICE_NUMBER;
...
double GROSS;
...
char VAT_NUMBER[31];
} InvoiceRow;

If I attempt to print GROSS using printf("%f", row->GROSS) I get 0.0000. However, if I change the type of GROSS to char[8] and then use the following code, I am presented with the correct number...

Code:

typedef struct example {
double d;
}

example;
example *blah = (example*)row->GROSS;
printf("%f", blah->d);

View 7 Replies View Related

C/C++ :: Storing And Retrieving Large Amount Of Data

Nov 28, 2012

I have to store large amount of data and retrieve the same data then write into file in C++. Currently I am using vector to store and retrieve. But vector is taking more time to store and retrieve the element. Is any other best data structure to store and retrieve large amount of data in unordered way?

Example code:

int I1 = 700,I2 = 32, I3 = 16;
//declare and resize the vector size
vector< vector < vector < vector<DOUBLE> > > > vPARAM; 
vPARAM.resize(I1, vector< vector < vector<DOUBLE> > >

[Code] ....

View 3 Replies View Related

C++ :: Using Smart Pointers To Sort And Relink Potentially Large Data Elements

Feb 20, 2014

I am trying to use smart pointers to sort and re-link potentially large data elements. I have defined a class in my code for smart pointers, as listed below:

template <typename T>
class sptr {
public:
sptr(T *_ptr=NULL) { ptr = _ptr; }

[Code] ....

The object I'm trying to sort is a singly-linked list containing class objects with personal records (i.e., names, phone numbers, etc.). The singly-linked list class has an iterator class within it, as listed below.

template <class T>
class slist {
private:
struct node {
node() { data=T(); next=NULL; }

[Code] .....

The following is my function within my list class for "sorting" using the smart pointers.

template <typename T>
void slist<T>::sort(){
vector< sptr<node> > Ap(N); // set up smart point array for list
//slist<T>::iterator iter = begin();
node *ptrtemp = head->next;

[Code] .....

I must have a bad smart pointer assignment somewhere because this code compiles, but when I run it, std::bad_alloc happens along with a core dump. Where am I leaking memory?

View 6 Replies View Related

C++ :: Compare Function On Sets

Oct 18, 2014

How can I declare a set of strings that i want to be ordered by a boolean function Comp? Like the one you would write in

sort(v.begin(), v.end(), comp);

I tried
set<string, Comp> S;
but it doesn't work

And the same question for priority queues, can you specify the function you want your elements to be sorted by?

View 4 Replies View Related

C++ :: Removing From A Vector Of Sets?

Apr 18, 2014

I have a vector of sets in which I wish to remove a set from the vector, if it contains a certain value, is there any way of doing this?

for(int j = 0; j <= clauseVector.size(); ++j){
if(clauseVector[j].find(find) != clauseVector[j].end())
std::cout << "FOUND: " << propagator << "
";
}
}

This is what I have been using to find the element, but is there a way to remove the set that contains the element?

If needed I can include the full code

View 11 Replies View Related

C :: Function That Scans In Two Sets Of Coordinates

Nov 5, 2013

Night now I'm working on part of a function that scans in two sets of coordinates. Is there a way I can scan both sets of coordinates in with the same scanf?

ex. scanf("%d%d%d%d," &destroyer_1, &destroyer_2, &destroyer_3, &destroyer_4)

but that didn't work...

Code:
printf("What are the first coordinates for destroyer:
");
scanf_s("%d%d", &destroyer_1, &destroyer_2);
gameboard[destroyer_1][destroyer_2] = 'd';
printf("What are the second coordinates for destroyer:

[Code] .....

View 5 Replies View Related

C++ :: Add And Subtract Two Sets Of Integer Arrays?

Apr 24, 2013

Ok so I am currently trying to add and subtract two sets of integer arrays. When I run my program, the program does not subtract the numbers from the first set with the numbers from the second set. Could anyone here take a look at my code and help me figure out what the problem is?

#include <iostream>
#include "SafeArray.h"
using namespace std;

[Code].....

View 1 Replies View Related

C++ :: Merging Sets In Linear Time

Jan 5, 2014

I have two std::sets S1 and S2 which I need to merge.

I know that the largest element of S1 is less than the smallest element of S2.

the additional information about S1 and S2 should enable me to do the insertion of each element in costant time instead of log(N).

what is the fastest way to calculate the union of the two sets?

View 6 Replies View Related

C++ :: Removing Part From A Vector Of Sets?

Apr 18, 2014

I have a vector of sets, which I am removing any element which contains a certain value. For example, if I was looking for 2:

[0] 1 2 3
[1] 4 5 6

After the program was run, I would be left with just [0]4 5 6.

This is the code I have been using

auto iter = std::remove_if( clauseVector.begin(), clauseVector.end(),[propagator] ( const std::set<int>& i ){
return i.find(propagator) != i.end() ; } ) ;
clauseVector.erase( iter, clauseVector.end() ) ;

I want to know, is there any way I can tweak this code so that it only removes one part of the set rather than the whole thing. For example with above example, I would be left with

[0] 1 3
[1] 4 5 6

View 4 Replies View Related

C++ :: Running Certain Code In Codeblocks Sets Off AVG?

Jul 3, 2012

I've recently started to learn C++ and I'm using codeblocks as my IDE, but I keep getting problems with AVG free edition picking up random pieces of code as Trojans ?! I've put an example of some code that sets it off below, and the error message I get. Is there anyway I can set AVG not to trigger with any codeblocks coding I've done? I guess I could tell AVG not to trigger for that folder, if that's even possible?

Code:

#include <iostream>
using namespace std;
class MyClass{
public:
void coolSaying(){
cout << "BUST A MOVE!!" << endl;

[Code] ....

Error I get:

File name: h:DesktopC++ projectsclasses and objects 1 inDebugclasses and projects 1.exe
Threat name: Trojan horse Agent3.BMSZ
Detected on open.

View 14 Replies View Related

C++ :: Find Intersection / Union And Difference Of Two Sets

Feb 17, 2014

The program is to find intersection,union and difference of two sets. The program take the input correctly but after it crashes with the message that some exe is not working...

Code:
#include<iostream>
using namespace std;
void Input(int *A, int*B, int size1, int size2)
//input function {

[Code] ....

View 3 Replies View Related

C++ :: Maximum Numbers Of Elements That Can Be Eliminated From A Set Of Sets?

Nov 17, 2014

I have come across a problem lately. You are given a set of n sets with m variables.. for instance {a,b,c,d}, {b,c,d}, {b,c}, {c,e,f}, {e,f}. And you want to eliminate elements from these sets with the restriction that you can only eliminate one item from each set and each item can only be eliminated from one set (i.e. if you've eliminated b from set {a,b,c,d}, then you cannot eliminate it from {b,c,d}). The problem is writing an algorithm which determines the maximum number of elements you can eliminate. And I'm hopelessly stuck... of course, you could backtrack it and determine this number but I feel it could be done more efficiently..

View 2 Replies View Related

C++ :: Function That Computes Result Of Formula For Two Sets Of Inputs

Nov 14, 2013

I'm currently writing a program and a portion of it needs to have a function that computes the result of a formula for two sets of inputs. It then has to calculate and display the difference between the two results.

I have code that takes the input from the user and converts it to the type that I need. In this case we are talking about latitudes and longitudes that will be converted to degrees, then to xyz coordinates.

View 1 Replies View Related

C# :: Parsing String Into IP Address

Oct 1, 2012

So im trying to parse a string into a Ip Address but i have a problem, the IPAddress.Parse method only works for ipv4 address's how do i parse ANY Ip address into a string, if i use the IPaddress.Parse method on my public(remote) IP it throws an exception but on ipv4 local ip it doesn't how do i parse ANY ip address the user inputs as a string as an Ip Address?

View 5 Replies View Related

C++ :: Parsing String To Contain Inputted Int

Sep 10, 2014

I have been here for almost 3 months looking for answers in my C++ problems.here's some type of code for this.

cout << "Enter value of x: " << endl; //Let's say 5.
cin >> x;
cout << "Enter equation: "; //Let's say x+1
cin >> equation;

Then the program analyzes that this character "x" has an initial value of 5.I already have the parser for the equation functions (+,-,*,/)This is the only thing lacking. Is there some type of function that i missed?

View 6 Replies View Related

C# :: Parsing XML File Properly

Apr 11, 2014

I'm parsing an xml file full of payslips and using the data in another application. I've got it all working but I suspect it isn't the most elegant piece of code. I run through the xml file finding a series of "Text" attributes/elements" and then I run through it again finding a series of "Field" attributes/elements, Here is a sample of the code:

For getting the "Text" fields :

XNamespace ns = "urn:crystal-reports:schemas:report-detail";

//
// Get all the Text attributes & Elements
//
foreach (XElement xtxt in xdoc.Descendants(ns + "Text"))

[Code]...

This works fine, I extract all the data I'm interested in and go to do my thing with it. However I really need to know when each record ends and I was doing that by looking for "Text24" in the text fields and "EeRef2" in the field fields, which wasn't very elegant in the first place. Then a "Text16" was added to end of each record which was fine I could just look for "Text16" but now it's apparent that "Text16" isn't always there. I've got it all working for now but I'd prefer to process one record at a time i.e. extract all the "Text" & "Field" values for one record, do whatever I need to do with it, update the xml file to indicate this progress ( if possible ) and then move on to the next record. I've attached a sample of the xml but basically is has the following structure :

<Details>
<Section>
<Text></Text>
"
<Field></Field>

[Code]...

So a record is everything between the first <Details> and the last </Details> with two <Details> and two </Details> in between.

View 2 Replies View Related

C :: Parsing String And Capturing In Variables

Nov 3, 2013

I have an input file that contains any number of lines. Each line will follow the same structure. There will be 5 fields, separated by 4 commas. Fields 1 and 3 will be single characters, fields 2,4,5 will be integers. Example:

<A, 123, B, 456, 789>

I need to iterate over each line, parse out the fields, and capture each field value into a separate variables.

header file: Code: #include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

[Code]....

I've used similar code to this before, but this time I'm getting the dreaded pointer from integer error on my strtok lines:

warning: passing argument 2 of 'strtok' makes pointer from integer without a cast'

View 2 Replies View Related

C :: Parsing A Line To Extract A Particular Substring?

Mar 14, 2014

Suppose I have read a line from an ASCII file with fgets(). Now I want to parse the line, which looks something like this: Code: # John Q. Public et al. 2014, to be submitted The name, "John Q. Public" is what I want. However, the name can be anything, consisting of 1 or more tokens separated by spaces. it could be "John" Or "John Public", or "Thurston Howell the 3rd", or etc... Bascially, I need to get the entire substring between the first hash mark, and the "et al" in the line. I tried this: Code: sscanf(line,"# %s et al.",name); But I can only get the first token (which, in this case, is "John").

View 2 Replies View Related

C++ :: Parsing Command Line Parameters

May 19, 2013

I have to make a c++ program, in which with an algorithm I have to code a text from a file and write it to another file. The input should like this: "code forCoding.txt toBeWritten.txt" ; or like this: "decode toBeReadFor.txt toBeWrittenIn". I have done everything except one thig: It is says I have to be able to input parameter.

How should i write this? I read [URL] ....., but still dont get. The input of my program has to have 3 strings, so I guess argc should be 3, but I dont really get it. What should I have in my main about this parsing command line parameters?

View 4 Replies View Related

C++ :: Parsing Multidimensional Vector Into Array

Nov 30, 2013

For a rather complex and strange reason that I won't explain right now, I need to have this going on in my program.

class FVF{
private:
vector<vector<float>> data; //Contains fvf data for Direct3D stuff
public:

[Code].....

The FVF allows this Model3D class to also be compatible with file handling methods I've got, but here's the problem. D3D buffers require an array to feed them the information, and I know that for a single dimension of vector I can use vec.data(), how to do this for multiple dimensions.

I think the best Idea I've got so far is to set the vector within the Model3D class as a pointer, then I can union it with a float pointer... Once I can guarantee the information is correct and complete, manually transfer the contents of the vectors into the float pointer.. (The union is to reduce memory needed instead of having the data repeated in vectors and arrays)

how I could just pass these as arrays?

View 4 Replies View Related







Copyrights 2005-15 www.BigResource.com, All rights reserved