Hi Oscar,
> Usually, calculation softwares demand chemical structures be uploaded
> in individual files (one molecule per file). I´d like to know if
> OpenBabel can export this way from a file containing all structures.
If you type 'babel -H' (help), you'll find this:
  -m  Produces multiple output files, to allow:
      Splitting: e.g. babel infile.mol new.smi -m
        puts each molecule into new1.smi new2.smi etc
Alternatively, if you are using SDF files, and you have access to Perl on a command line, I can send you a fairly simple Perl program that will split an SDF file into individual records. It has the advantage that it doesn't alter the SDF records at all.
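For anyone without Perl to hand, the same idea can be sketched in Python. This is not the Perl program mentioned above, just a minimal illustration. It assumes each record ends with a line containing only `$$$$` (the SDF record delimiter). The function names and the `mol` file prefix are made up for the example.

```python
def split_sdf_records(text):
    """Split SDF text into records; each record keeps its '$$$$' line unchanged."""
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.rstrip("\r\n") == "$$$$":
            records.append("".join(current))
            current = []
    # Keep a trailing record that lacks the terminator, but skip pure whitespace.
    leftover = "".join(current)
    if leftover.strip():
        records.append(leftover)
    return records


def write_records(records, prefix="mol"):
    """Write each record to its own file: mol1.sdf, mol2.sdf, ... (hypothetical naming)."""
    for number, record in enumerate(records, start=1):
        # newline="" stops Python from translating line endings on write.
        with open(f"{prefix}{number}.sdf", "w", newline="") as handle:
            handle.write(record)
```

To leave the records byte-for-byte as they were, read the input with `open(path, newline="")` as well, so that line endings are not translated on the way in.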
Craig
------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
If you’ve got several Excel files that you need to merge (or worksheets in a workbook), you might be having a hard time working out how to do it. There are some useful features in Excel such as 'Consolidate' and 'Remove Duplicates' but these often don't quite hit the mark.
Surprisingly, this solution worked with a file of 1,500,000 lines. I tried two files, one containing 10,000 lines and the other 1,500,000 lines, although the split function is only good for fewer than 10,000 lines. My aim in writing this code was to use 100% managed code in VB.NET. The performance was not bad either: I was able to split the Windows 2000 SP3 file, which is 120 MB, into 10 KB chunks in about 6.5 minutes, and the .NET SP2 file, which is 5.5 MB, in less than a minute.
The first thing to work out is what type of merge you want to do. What you need from a merge can vary from situation to situation. Maybe you just need all the rows from each spreadsheet into one, consolidated, workbook. Or maybe your needs are more complex and you need to merge spreadsheets that have different formats, de-duplicating rows as you go. Some of the variations are discussed in the following sections so read on to find what you need.
Concatenate rows from multiple spreadsheets into one
This is probably the simplest type of merge where you’d like to add the rows from all the source spreadsheets to a single output spreadsheet. In this situation, all the worksheets to be merged will have the same columns in the same order.
An example of this type of merge is where you have several spreadsheets, each containing a single worksheet with columns A to F populated with:
- First Name
- Last Name
- Age
- Sex
- Marital Status
Each spreadsheet has this column structure and you’d like to concatenate the rows from each into just one sheet in one spreadsheet.
If this is what you need, you have a few options, including:
- Copy & paste the rows from the source spreadsheets to the end of the rows in the output spreadsheet
- Use a VBA (Visual Basic for Applications) macro to merge the spreadsheets
- Convert the spreadsheets to CSV files then concatenate them from the command line
Which option you choose really depends on how many spreadsheets you have to merge. More than a few and copy/paste becomes a real pain, as does converting them all to CSV files, so using VBA code may be your best option.
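If you take the CSV route, the concatenation step can also be scripted. The sketch below uses Python rather than a command-line tool and only illustrates the idea. It assumes every CSV file has the same header row, and the function name `concat_csv` is made up for the example.

```python
import csv


def concat_csv(sources, out):
    """Append the rows of several CSV sources to one output, writing the header once.

    `sources` is an iterable of open text files (or file-like objects);
    `out` is an open text file to write to.
    """
    writer = csv.writer(out)
    header = None
    for source in sources:
        reader = csv.reader(source)
        first_row = next(reader, None)
        if first_row is None:
            continue  # empty file, nothing to add
        if header is None:
            header = first_row
            writer.writerow(header)
        elif first_row != header:
            # The simple concatenation only makes sense when the columns match.
            raise ValueError("CSV files do not share the same header row")
        writer.writerows(reader)
```

When using real files, open them with `newline=""` as the `csv` module documentation recommends. Raising an error on a header mismatch is a design choice here: it stops misaligned data from slipping into the output unnoticed.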
For step by step instructions on how to do each of these, including example spreadsheets, please have a look at How to merge Excel files with the same columns.
Merge spreadsheets with different columns
If the spreadsheets you’d like to merge have differing columns, just concatenating them together isn't much use as the data in the output spreadsheet will not be aligned. What you need to do in this case is map the columns from each spreadsheet onto the correct columns in the desired output spreadsheet.
As an example of this situation, imagine you have 3 spreadsheets, each containing a single worksheet:
- The first spreadsheet has columns A to D populated with: First Name, Last Name, Email, Age
- The second spreadsheet has columns A to D populated with: Email, First Name, Last Name, Sex
- The third spreadsheet has columns A and B populated with: Email, Date Of Birth
Here, you’d like to merge the data from each spreadsheet into just one sheet in one spreadsheet with the same set of columns. Let’s say we want columns A to F to be:
- First Name
- Last Name
- Email
- Age
- Sex
- Date Of Birth
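This implies a column mapping in which each column of each source spreadsheet is assigned to the output column with the same name, wherever it sits in the source, and output columns with no matching source column are left blank for that spreadsheet's rows.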
To achieve this kind of mapping when merging spreadsheets, you will need to use VBA (Visual Basic For Applications). If you’d like to delve further into this approach, you might like to read How to merge Excel files with different columns.
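To make the mapping idea concrete, here is a minimal sketch in Python rather than VBA. It maps columns by header name. The output column order (including where Email sits) and the function name `align_row` are assumptions made for the illustration.

```python
# Assumed output layout for the illustration (columns A to F).
OUTPUT_COLUMNS = ["First Name", "Last Name", "Email", "Age", "Sex", "Date Of Birth"]


def align_row(header, row, output_columns=OUTPUT_COLUMNS):
    """Re-order one source row so its values line up with the output columns.

    `header` lists the source column names, `row` the values in the same order.
    Output columns that the source does not have are left as empty strings.
    """
    by_name = dict(zip(header, row))
    return [by_name.get(column, "") for column in output_columns]


# A row from the second example spreadsheet (Email, First Name, Last Name, Sex);
# the values are invented sample data.
aligned = align_row(
    ["Email", "First Name", "Last Name", "Sex"],
    ["ann@example.org", "Ann", "Lee", "F"],
)
```

Running each source row through a function like this before appending it to the output keeps the data aligned even though the sources order their columns differently.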
Copy all sheets from separate workbooks into a single workbook
Sometimes, you have a number of separate spreadsheets and you want to copy all of the worksheets from each into a single Excel workbook.
For example, you may have one workbook with 2 sheets, another with 7 sheets and a third with 3 sheets. The data is all related and you simply want to have all the sheets from the separate workbooks in a single workbook.
You have a couple of options for how to do this:
- Copy & Paste the worksheets into the master spreadsheet
- Use a VBA (Visual Basic for Applications) macro to merge the worksheets into the master spreadsheet
The copy and paste approach can be a good choice if there aren’t too many worksheets to copy, but you’d get pretty tired of it when there are lots of them. VBA code is probably your best bet when you have lots of worksheets to copy.
Step by step instructions for both of these options, including example spreadsheets, are covered in How to merge worksheets from multiple Excel workbooks into one.
De-duplication of rows
Taking these approaches a step further, you may also want to de-duplicate the rows in the final spreadsheet. If duplicates are rows in which every column has the same value, there are several approaches you can use. If duplicates are instead identified by just one or two columns (e.g. email address, or first name, last name and date of birth), a good approach is to merge the spreadsheets using VBA, so that your VBA code can merge the duplicate rows as each spreadsheet is processed. This is discussed in the section De-duplicate using VBA code, below.
Remove Duplicates feature in Excel
Under the Data ribbon in the Excel menu, there’s an option called Remove Duplicates. This identifies duplicates as rows where the column values are the same and will remove all duplicate rows except one.
Select all the data in the spreadsheet then click Remove Duplicates and you're done.
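The rule being applied, where rows count as duplicates only when every column matches, can be sketched outside Excel too. The Python snippet below is an illustration of that rule, not of how Excel implements it. Keeping the first copy of each row is an assumption, and the function name is made up.

```python
def remove_duplicate_rows(rows):
    """Return the rows with exact duplicates removed, keeping the first copy of each."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # the whole row is the key: every column must match
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```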
Use Advanced Filters
This method uses the built-in Advanced Filters functionality of Excel to hide the duplicate rows. Again, this identifies duplicates as rows where the column values are the same. To use this feature, follow the steps below:
- Select all the data you want to hide duplicates in
- In the Data ribbon in the Excel menu, select the Advanced button in the Sort & Filter section
- Select the ‘Unique records only’ checkbox and click OK
De-duplicate using VBA code
This is the most flexible of the approaches as you can define how the de-duplication works and which values from each spreadsheet take precedence in the merged row. The concept is based on that discussed above, in Merge spreadsheets with different columns.
- Build a matrix mapping the columns from each spreadsheet to the output columns
- Identify the key column(s), i.e. the column(s) which will be used to identify duplicates
- In the VBA code, whilst adding rows, look up the matching key columns in the output spreadsheet. If there is a match, merge the other columns into the matching row as appropriate
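The steps above can be sketched as follows. This is written in Python rather than VBA and is only an illustration of the key-column idea. The precedence rule (the first non-empty value seen for a column wins) and the function name `merge_by_key` are assumptions, not something the approach prescribes.

```python
def merge_by_key(rows, key_columns):
    """Merge rows that share the same key-column values into one row each.

    Each row is a dict of column name -> value. For every other column,
    the first non-empty value encountered is kept (assumed precedence rule).
    """
    merged = {}
    for row in rows:
        key = tuple(row.get(column, "") for column in key_columns)
        target = merged.setdefault(key, {})
        for column, value in row.items():
            if value not in ("", None) and column not in target:
                target[column] = value
    return list(merged.values())


# Invented sample rows from three sources, de-duplicated on Email.
combined = merge_by_key(
    [
        {"Email": "ann@example.org", "First Name": "Ann", "Age": "30"},
        {"Email": "ann@example.org", "First Name": "Ann", "Sex": "F"},
        {"Email": "bob@example.org", "Date Of Birth": "1990-01-01"},
    ],
    key_columns=["Email"],
)
```

If you want a later spreadsheet to override earlier values instead, the check `column not in target` is the place to change the precedence.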
More information and step by step instructions on this technique, including example code and spreadsheets, can be found in How to combine multiple Excel files into one whilst merging row data.
Martin Judd heads up Joined-up Data, a product and service which saves you the headaches of merging and de-duplicating Excel files. If you want more detail on how to merge and de-duplicate Excel files, see The Ultimate Guide on How to Merge Excel Files.