Error reloading study: "This file does not appear to be a valid .zip file."

Study Forum (Inactive)
craig.m  2012-03-23 11:51
Status: Closed
 
Hey all,

You probably figured out my problem by the title: I'm trying to reload a study with a zip archive of data files, but the LK interface is reporting: "This file does not appear to be a valid .zip file."

The message is nondescript enough that I can't tell whether it's having a problem with the compression or with the contents of the file. Either way, I did my best to conform to the software's expectations: I'm using the CLI `zip` tool on Ubuntu, with the files organized in the archive exactly the way they are in a downloaded export archive, and I've played with the option that (allegedly) forces the archive to not use the zip64 format. Nothing gets around the aforementioned error.

Any pointers would be appreciated.

Thanks,
Craig
 
 
jeckels responded:  2012-03-23 13:16
Hi Craig,

Looking at the code, we'll show this error message if there's any kind of problem reading the file. That could include a .zip format that we don't understand, or more general issues like permission errors. In addition to needing read permission for the file, the server will try to extract the contents to a new subdirectory created next to the study archive itself, so it will also need to be able to create directories, etc.

Also, are you uploading the file as part of the request, or have you browsed to it through the regular file management UI?

Thanks,
Josh
 
craig.m responded:  2012-03-23 13:22
Hi Josh,

I'm trying to upload the file on the "Reload Study" (importStudy.view) page. I'm using the "Reload Study From Local Zip Archive" feature.

So if I understand you correctly, I need to determine where the file is being uploaded to, where the archive is going to be expanded, and confirm that this is a location where the process has appropriate permissions to do this work. And still, this might end up being an issue with the zip format, or anything else in the "doesn't work" category. It's unfortunate that the error message is so generic. Do you recommend some first steps for troubleshooting?

Thanks,
Craig
 
marki responded:  2012-03-24 09:29
Hi Craig, here are a couple of ideas.
1) If it is a file system config problem, no file will upload. You could download our demo study and try that; if it works, then the problem is likely with your file itself. The demo study should be here: https://www.labkey.org/wiki/home/Documentation/page.view?name=setupDemoStudy

2) If it is a problem with the file itself, it would have to be very early on in processing. There may be something in the logs -- if not, I can create a private drop location. You could upload either the file itself (it will be kept confidential) or a version with only the header lines in the datasets. I could then do the upload privately in a debugger to find the exact issue.
 
craig.m responded:  2012-03-26 15:27
Mark,

Thanks for the tips. I am not getting the demo study to work, so I suppose that I am doing something wrong on the permissions side. Specifically, I am probably setting up a pipeline root with inappropriate permissions.

Similarly, I can't find the link to just import a study (the documentation doesn't seem to match the current version of the software), so that could also be something I am doing incorrectly. I.e., I'm trying to reload through a pipeline instead of just importing the data; I don't need to set the study up for reloading, since it is basically finished. This is only an issue because the process of setting up a study won't allow populating the data from a file if the data requires an additional key: I have no problems loading the data with one-to-one relationships, but the one-to-many data appear to require manual configuration and importing, which is where I'm stumbling.

Any pointers?
 
marki responded:  2012-03-27 08:59
Couple of things.

1) It's possible to import a file with an extra key without the study reload mechanism. When you create a new dataset, don't check the "create from file" option. This will take you to the full define-dataset page. From there you can define the set of fields using "Infer Fields from File", then set the "Additional Key Column" at the top of the page. You can then import the file after you Save. (View Data, then Import)

2) To import from a zip, the button is at the bottom of the Manage Study page. If the zip is on your local machine, you upload it by first selecting it with the "Browse" button, then clicking the "Load from Archive" button. But I think this is probably what you were doing.

I do think that if there is a perms/config problem on the server, it will come back to bite you later, so we might want to track that down.
 
jeckels responded:  2012-03-27 09:56
Hi Craig,

To determine the directory the server is using for storing these files, go to the folder where you're importing the study. Go to Admin->Go To Module->Pipeline, or Admin->Go To Module->More Modules->Pipeline. Click the Setup button.

Unless you've manually configured it to point elsewhere, the folder will be using the "Use a default based on the site-level root" option, and it will show the path it's using. Ensure that the OS user executing the Tomcat process has permission to read, write, create subdirectories, etc., at that location.

If the "Set a pipeline override" option is selected instead, check the file permissions on the "Primary directory" path.
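If it helps, here's a quick command-line sanity check. Run it as the OS user that owns the Tomcat process (e.g. via `sudo -u tomcat -s`), and substitute the path shown on the Setup page for the placeholder below:

```shell
# Probe whether this account can create, write into, and remove a
# subdirectory of the pipeline root -- the same operations the server
# performs when it extracts a study archive.
PIPELINE_ROOT=/labkey/files   # placeholder: use the path from the Setup page
mkdir "$PIPELINE_ROOT/perm_test" &&
  touch "$PIPELINE_ROOT/perm_test/probe" &&
  rm -r "$PIPELINE_ROOT/perm_test" &&
  echo "pipeline root looks writable"
```

If any of those steps fails, that's the permission problem to fix before retrying the reload.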

Thanks,
Josh
 
craig.m responded:  2012-03-27 10:29
Thanks for the help, all. I think I've narrowed the problem down to permissions on the pipeline root, and now the errors I'm encountering are more banal (poor formatting of a couple of TSV files). Right now the pipeline is running with the latest version of the data upload, and my fingers are crossed.

If I can offer some quick feedback: there was some confusion on my side from what appears to be interchangeable use of the terms "reload" and "import". I've gathered that they mean the same thing -- pushing a flat data file into a table in the database -- but the multiple points of entry into this functionality (and the multiple terms used to describe them) left me a little confused about how the whole system actually works.

Thanks again.