Thursday, June 21, 2012

Moving from LiveJournal (via WordPress) to Blogger, a step-by-step guide


Introduction:

I know many people have had trouble doing this, particularly on the step requiring the Google Blog Converter. I finally got it to work after an entire day of trial and error, so I thought I'd share what worked for me. Hopefully this helps those of you who, like me, are uncomfortable using the command line.

Background:

I had a very large LiveJournal (~700 entries) that I wanted moved to Blogger.
But because LiveJournal has no convenient export feature (other than month-by-month, which would have taken forever by hand), I first copied the LiveJournal to WordPress.
So I signed up for a dummy WordPress blog, and used their "Import" option under the "Tools" setting.
This took quite a while and a few tries to get right, so make sure it's done properly before you continue.
From here, you can now "Export" the blog as a WordPress-specific .xml file (I just used the default option, exporting entries, comments, etc.). Save this on your hard drive.

Using Google Blog Converter on the command line:

There is a fabulous online conversion tool to convert between various blog formats, but the problem is that it won't handle files over 1MB. My blog, in WordPress .xml format, was around 7MB. Too big.
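If you aren't sure which side of the limit your own export falls on, the size can be checked from the terminal. This is only a rough sketch: the function name `too_big_for_online_tool` is made up for illustration, and it assumes "1MB" means 1,048,576 bytes.

```shell
# Hypothetical helper: succeeds (exit status 0) when the file is larger
# than 1 MB, i.e. probably too big for the online converter.
# Assumption: "1MB" means 1048576 bytes.
too_big_for_online_tool() {
    size=$(( $(wc -c < "$1") ))    # file size in bytes
    [ "$size" -gt 1048576 ]
}

# Example use (the file name is the one used later in this guide):
# too_big_for_online_tool myjournalsample-wordpress.xml && echo "Use the command line"
```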
So I downloaded the latest .tar.gz source files from here

(Note: ---- here begins what I did in Ubuntu. I have no idea how this would work in Windows or Mac, though it is probably very similar----)

In Ubuntu, I right-clicked the downloaded file and chose "Extract Here". The file was extracted into a folder containing all the necessary blog converters. 
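If you already have a terminal open, the same extraction can probably be done there instead of in the file manager. A minimal sketch: `extract_here` is just an illustrative name, and the archive name in the example is a guess based on the folder name used in this guide.

```shell
# Hypothetical helper: unpack a .tar.gz archive into the folder it sits in,
# roughly what "Extract Here" does in the file manager.
extract_here() {
    # -x extract, -z gunzip, -f archive file, -C folder to extract into
    tar -xzf "$1" -C "$(dirname "$1")"
}

# Example use (the archive name is an assumption; yours depends on the release):
# extract_here ~/Downloads/google-blog-converters-r89.tar.gz
```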
Now then, I opened my terminal.
First we need to tell the terminal to "change directory" into the folder we just created (when we extracted the .tar.gz). So for me, I needed to input:
cd ~/Downloads/google-blog-converters-r89
Note that you may need to change the last part depending on the folder's title (which will depend on which release version you download).
Next, just for simplicity and to follow the Readme file as closely as possible, I went back to my desktop and moved my WordPress-exported blog file (the .xml file) into the folder called "samples" inside the "google-blog-converters-r89" folder (the one that was created when I right-clicked and extracted the .tar.gz download).
Now then, back in the terminal, I typed:
bin/wordpress2blogger.sh samples/myjournalsample-wordpress.xml
Of course, you can substitute whichever blog converter you want in the first part (to find the correct name, look inside the "bin" folder inside the "google-blog-converters-r89" folder). The second part should be the name of your WordPress-exported file. Mine was "myjournalsample-wordpress.xml".
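The converter names can also be listed without leaving the terminal. A small sketch, assuming every converter in the "bin" folder is a file ending in ".sh" (like wordpress2blogger.sh); the function name `list_converters` is made up for illustration.

```shell
# Hypothetical helper: print the names of the scripts in a "bin" folder.
# Assumption: every converter is a file ending in ".sh".
list_converters() {
    for script in "$1"/*.sh; do
        if [ -e "$script" ]; then
            basename "$script"
        fi
    done
}

# Example use, from inside the extracted folder:
# list_converters bin
```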

Now if everything worked well, the terminal should chug away for a while, eventually stopping with a huge chunk of text visible (part of your entries). When that's done, you can move to the next section, "Importing to Blogger." If it didn't work, you might have the same problem I ran into, described below.

My problem using LiveJournal via WordPress

This step in the terminal would return an error, claiming the file is not a real WordPress document. It turns out the reason was that the file included a bunch of LiveJournal metadata, mixed in with the WordPress data, that the converter couldn't process. So I had to remove these parts. I chose to use the "Find / Replace" feature of LibreOffice.

However, before I could do this, I had to add a special extension to LibreOffice to allow me to remove all instances automatically. I downloaded the "Alternative dialog Find & Replace for Writer" and installed it to LibreOffice using the "Extension Manager" in LibreOffice's "Tools" menu (using the "Add..." button). I then restarted LibreOffice.

Once this extension was installed, I could "Find / Replace" what I needed. As it turns out, all of the "" tags needed to be removed, along with all the metadata (left over from the import from LiveJournal) stored between them. So, using the new extension's "Find / Replace" feature, I input this string in the "Find" box:
[::BigBlock::]
The "big block" regular expression can be found in the "Extended" menu of the dialog box also.

Then, leaving the "Replace" field blank, I clicked "Replace All".

This took a while, as my blog was over 2000 pages in LibreOffice. But it worked: all these tags and the unprocessable data between them were removed.

Then I saved the document as plain text in LibreOffice, closed LibreOffice, and renamed the document extension from ".txt" to ".xml". (I didn't do an actual conversion in LibreOffice because it seemed to keep locking up on me.)
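For readers who would rather skip LibreOffice, a similar cleanup might be possible with sed in the terminal. This is only a sketch, and it rests on assumptions the steps above don't confirm: the example guesses that the blocks to remove are WordPress's `<wp:postmeta>` elements, and it only works when each opening and closing tag sits on its own line (which is how WordPress exports usually look). The function name `strip_blocks` is made up. Work on a copy of your export, not the original.

```shell
# Hypothetical helper: copy an export file, leaving out every block of lines
# that runs from a line containing <TAG> to the next line containing </TAG>.
# Usage: strip_blocks TAG input.xml output.xml
strip_blocks() {
    tag=$1
    # sed deletes (d) each range of lines from the opening tag to the closing tag.
    # "|" is used as the pattern delimiter so the "/" in </TAG> needs no escaping.
    sed "\|<$tag>|,\|</$tag>|d" "$2" > "$3"
}

# Example use (the tag name is a guess, see above):
# strip_blocks wp:postmeta myjournalsample-wordpress.xml cleaned-wordpress.xml
```

Writing to a new file rather than editing in place means the original export is still there if the result looks wrong.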

Importing to Blogger

Earlier I had done this command:
bin/wordpress2blogger.sh samples/myjournalsample-wordpress.xml
It should work, printing the converted blog in your terminal window. But how do you get it out of there? Turns out, it's quite simple. Run the command again with a small addition, telling the terminal to dump the converted blog into a text file.

bin/wordpress2blogger.sh samples/myjournalsample-wordpress.xml > hopefully.txt 

Adding that last part, the "> hopefully.txt", means that the dump of converted blog data is written to a file called hopefully.txt instead of being shown in the terminal.
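The redirect can also be wrapped with a quick check that something was actually written. A sketch only: `convert_to_file` is a made-up name, and it assumes the converter prints the converted blog to the terminal (as described above) and exits with an error status when it fails, which this guide doesn't confirm.

```shell
# Hypothetical helper: run a converter on an input file, save what it prints
# into an output file, and report whether anything was written.
# Usage: convert_to_file CONVERTER INPUT OUTPUT
convert_to_file() {
    # -s is true when the file exists and is not empty
    if "$1" "$2" > "$3" && [ -s "$3" ]; then
        echo "Saved converted blog to $3"
    else
        echo "Conversion failed or produced no output" >&2
        return 1
    fi
}

# Example use, with the file names from this guide:
# convert_to_file bin/wordpress2blogger.sh samples/myjournalsample-wordpress.xml hopefully.txt
```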

Finally I am done with the Terminal!

Now I go to my file manager and find this file, which for me was just inside the "google-blog-converters-r89" folder. I renamed it "hopefully.xml" (since that's really what it is) and went to Blogger.

I set up a new blog there, went to the Options page, chose "Import" and selected the file.
It seemed to be uploading OK for about 10 minutes... but then... ERROR! Cannot Import the Blog!


Don't panic. After closing out of that dialog, I discovered that in fact it had imported all ~700 of my entries, WITH COMMENTS!!!!! They just remained unpublished.

The last step was to manually publish them all, made easier by displaying 50 posts at a time on the Blogger Posts page (you can display 100 at a time, but batch operations only work up to 50). So by displaying 50 entries per page, I had about 13 pages of blog posts, unpublished.
All that remained was to click the top-left checkbox to select all entries on the page, click "Publish", and let it post them! Then move to the next 50, repeat, and so on. It took only about 5 minutes.

Now they are all posted on Blogger! All with their original dates! All with original comments! FINALLY!

Conclusion

This was an incredibly difficult and mind-boggling process for me, so don't give up. I'm sure there is an easier or more efficient way than I've done it, but I surely didn't see anything online about it. Hopefully something in this guide can help you get out of a rut. Good luck!
