Topic: number of columns/variables restricted in text file import?

Offline kari

  • Newbie
  • Posts: 5
number of columns/variables restricted in text file import?
« on: December 09, 2010, 10:38:52 »
Hi - I am trying to import a text file with a large number of columns (for some feature selection), but I get the error "Invalid data source format". A text file in exactly the same format but with only a few variables reads in fine. Is there an upper limit on the number of columns (variables) the text file can contain?

Kari Ruohonen

Offline Anders L Madsen

  • HUGIN Expert
  • Hero Member
  • Posts: 2282
Re: number of columns/variables restricted in text file import?
« Reply #1 on: December 10, 2010, 11:02:36 »
Hi Kari,

There should be no upper limits on the number of variables.

Please specify exactly which function produces the error, e.g., loading data in the Learning wizard or in the CPT view. Also, if you can provide a simple example of your data file, that would be very helpful.
HUGIN EXPERT A/S

Offline kari

  • Newbie
  • Posts: 5
Re: number of columns/variables restricted in text file import?
« Reply #2 on: December 10, 2010, 12:47:18 »
Hi,
The error appears when loading data into the Learning wizard, specifically after browsing to the file to load and pressing "next". I generated a data file containing random numbers with R and saved it as a text file in the proper format. The file has 2000 variables with 100 cases each, and it gives the error. A similar file with 1000 variables and 100 cases each loads without error (I could not attach the files since I was told that they are too large).

However, after this exercise it looks like the number of variables as such is not the problem: with my original data file the error appeared somewhere around 700 variables, some of which contain text rather than numbers (each variable had 72 cases there). Feels like it could be a maximum number of bytes reserved for the dimension of this operation or something? I am facing this with the 64-bit Linux version.
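For readers who want to reproduce the test, a comparable file can be generated along the following lines. This is only a minimal Python sketch, not the R code used for the files described above. The comma delimiter, the header row of variable names, and the names `write_random_cases` and `V1`, `V2`, ... are assumptions made for illustration and are not taken from the HUGIN documentation.

```python
import csv
import io
import random


def write_random_cases(out, n_vars, n_cases, delimiter=",", seed=0):
    """Write a header row plus n_cases rows of random numbers to `out`.

    Assumed layout (hypothetical, adjust to your importer): one header
    line with variable names, then one line per case.
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow([f"V{i}" for i in range(1, n_vars + 1)])
    for _ in range(n_cases):
        writer.writerow([f"{rng.random():.6f}" for _ in range(n_vars)])


# Build the 2000-variable, 100-case file in memory.
buf = io.StringIO()
write_random_cases(buf, n_vars=2000, n_cases=100)
lines = buf.getvalue().splitlines()
print(len(lines), "lines,", len(lines[0].split(",")), "columns")
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")`. Changing `n_vars` in steps between 1000 and 2000 should help narrow down where the import starts to fail.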

Offline Anders L Madsen

  • HUGIN Expert
  • Hero Member
  • Posts: 2282
Re: number of columns/variables restricted in text file import?
« Reply #3 on: December 10, 2010, 13:01:08 »
>Feels like it could be a maximum number of bytes reserved for the dimension of this operation or something?

The error message "Invalid data source format" does not indicate a memory issue. It suggests a data format error. The problem may, for instance, be related to the use of separator symbols in the data file. This would also explain why data is not read correctly. For instance, if your data has missing values, then it is necessary to use the correct symbol for them and to keep the right number of separator symbols on every line.
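One way to test the separator hypothesis is to count the fields on every line and compare against the header. The following is a minimal Python sketch, assuming a comma-delimited file with a header row; `check_delimited` is a hypothetical helper and not part of HUGIN.

```python
import csv
import io


def check_delimited(stream, delimiter=","):
    """Return (expected_field_count, [(line_no, field_count), ...]).

    The second item lists every data line whose field count differs
    from the header. Assumes one record per physical line.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    bad_rows = []
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        if len(row) != expected:
            bad_rows.append((line_no, len(row)))
    return expected, bad_rows


# Small in-memory example: line 3 has an empty (missing) value but the
# right number of separators; line 4 is one separator short.
sample = "A,B,C\n1,2,3\n4,,6\n7,8\n"
expected, bad_rows = check_delimited(io.StringIO(sample))
print(expected, bad_rows)  # 3 [(4, 2)]
```

For a real file, pass `open(path, newline="")` instead of the `StringIO` object. An empty `bad_rows` list means the separator counts are consistent, which would point away from a format error of this kind.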

Would it be possible to compress your two data files and send them by email to anders at hugin dot com?
HUGIN EXPERT A/S

Offline kari

  • Newbie
  • Posts: 5
Re: number of columns/variables restricted in text file import?
« Reply #4 on: December 10, 2010, 13:26:42 »
I ruled out format issues since I specified the format in exactly the same way for both example data sets; the only difference is the number of variables. Also, for my original file I saved slices of the full data set from R using the same format and still got the errors. My artificial data does not include any missing values. My original data has missing values, but Hugin reads them OK (coded as "N/A" or just null, as suggested in the help files). But I will send you the example data files by email.

Offline Martin

  • HUGIN Expert
  • Hero Member
  • Posts: 613
Re: number of columns/variables restricted in text file import?
« Reply #5 on: December 16, 2010, 12:17:53 »
Dear Kari

Thank you for reporting this bug. There was indeed a problem with the maximum size of a buffer. We will release an update soon.

Kind regards
Martin
Hugin Expert A/S