Git Cherry Picking Across Forked Repos and Empty Commits

Recently I wanted to bring a specific upstream commit into a forked repository. Although the two repos share a common history, they had diverged enough that it wasn’t a straightforward cherry-pick between branches. Instead, with clones of the two repositories, I managed to cherry-pick as follows:

git --git-dir=..//.git format-patch -k -1 --stdout  | git am -3 -k

To complicate things further, a few days later I wanted to do the same thing again. This time, however, a submodule and another file had diverged enough that the patch no longer applied correctly. To get around this I had to run:

git --git-dir=..//.git format-patch -k -1 --stdout  | patch -p1 --merge

After that, manually fix any changes that are still broken, then create a new commit with the result.

These two Stack Overflow answers helped me work out both of these issues: https://stackoverflow.com/a/9507417 and https://stackoverflow.com/a/49537226

Finally, in recent months I’ve also found myself wanting to create a completely empty commit to kick off a downstream build process, much like you might touch a file to change its timestamp. To do this you can simply run:

git commit --allow-empty -m "Redeploy"

Merging a git repository from upstream when rebase won’t work

I use a lot of open source software in my research and work.

In recent months I’ve been modifying the source code of some open source repositories to better suit my needs, and I’ve contributed a few small changes back to the DeepLearning4J and Snacktory projects.

This morning I started work on a further patch for the DeepLearning4J repository and needed to bring my local repository up to date before committing the change. However, at some point over the past few months the DeepLearning4J repository was rebased and my fork will no longer merge.

The usual approach for fixing this is to use the command:

git rebase upstream/master

However, for me this produces an error:

git encountered an error while preparing the patches to replay
these revisions:

As a result, git cannot rebase them.

I tried on two different computers and got similar errors on both.

As I didn’t want to delete my entire repository and create a whole new fork of the upstream master, this is the approach I took to fix the problem:

Back up the current master into a new branch:

git checkout -b oldMasterBackupBranch
git push origin oldMasterBackupBranch

Switch back to the master branch and replace it with the upstream master:

git checkout master
git remote add upstream url/to/upstream/repo
git fetch upstream master
git reset --hard upstream/master

Push the updated master to my GitHub fork:

git push origin master --force

This Stack Overflow question helped a lot in working out this problem: “Clean up a fork and restart it from the upstream”

Git logs and commits across multiple branches

Like any good computer scientist I use git for many research and personal projects. My primary use of git is for code backups rather than collaborating with others. However, in some of my recent work I’ve been sharing repositories with colleagues and students which has caused me to improve my git skills.

The following is some of the functionality I’ve only recently discovered that has been extremely helpful:

git cherry-pick commit-id-number

This command proved very useful when I recently forked a GitHub repo and made some changes to the source code for the specific project I’m working on. I soon discovered a bug in the original repository that a number of users had reported. I was able to fix the bug in my fork, but my fork also had changes that I didn’t want to contribute back to the original repository. The cherry-pick command let me bring across only the specific commit related to the bug fix.

git checkout --theirs conflicted_file.php

Merge conflicts suck. Despite pulling as often as possible they still occur, and they can fill your code with ugly messes to clean up. I recently wanted to throw away my changes to a file and simply use the latest committed version. By using git checkout --theirs I was able to discard all my changes and take the version that had been committed and conflicted with mine. Conversely, you can use --ours to resolve the conflicted file in favour of your local changes.

git shortlog

During the past few weeks the students in the course I’ve been teaching this semester have used git to collaborate on group projects. The git shortlog command produces a list of commits grouped by each author allowing you to quickly see the relative rate at which people are contributing commits to a repository.

git branch -a

When you clone a remote repository it fetches all of the remote’s branches, but only as remote-tracking branches. If you just type git branch you will only see your local branches. The -a flag lists the remote-tracking branches as well.

git log --all

The same issue applies when you want to see the log across all branches: the standard git log command only shows the log for the current branch. The --all flag shows commits from all branches. Combining this with the cherry-pick command is very useful when you want to bring across just one set of changes rather than merging a whole branch.

git log --all --stat --author="Tom"

Bringing this all together I’ve begun to regularly use the above command to see all commits by a single user across all branches. This has been a good way to measure students’ contributions to a group project (note: the author option is case sensitive).

Code Structure, Random Number Generation and Conditional Probability

I had an interesting discussion with a group of students this afternoon which highlighted how important it is to understand conditional probability and how code structure can produce unexpected results.

The students’ code contained three lists and a random number generator. The random number generator was meant to randomly choose which list to take something from with equal probability (that is, a 1/3 chance of selecting from each of the lists). However, somehow one of their lists was being selected far more often than the others.

Their code looked similar to the following:

if (random.nextInt(3) == 0) {
  // select from list 1
} else if (random.nextInt(3) == 1) {
  // select from list 2
} else {
  // select from list 3
}

Can you spot the bug? Why would this not generate equal probabilities of selecting from each list?

I wrote some of my own code to test this code structure almost 100 million times:

Random r = new Random(); // java.util.Random
int count0 = 0;
int count1 = 0;
int count2 = 0;

// loop almost 100 million times
for (int i = 0; i < 99999999; i++) {
  if (r.nextInt(3) == 0) {
    count0++;
  } else if (r.nextInt(3) == 1) {
    count1++;
  } else {
    count2++;
  }
}

Which generated the results:
Count0 = 33333271
Count1 = 22221788
Count2 = 44444940

Which demonstrates this code doesn’t generate equal probabilities.

The problem is the second random number generation:

if (r.nextInt(3) == 0) {
  count0++; // runs with probability 1/3 (0.333...)
} else if (r.nextInt(3) == 1) { // reached 2/3 of the time; the fresh draw is 1 for 1/3 of those
  count1++; // runs with probability 2/3 * 1/3 = 2/9 (0.222...)
} else {
  count2++; // runs with probability 2/3 * 2/3 = 4/9 (0.444...)
}
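
As a cross-check on the reasoning above, here is a minimal Java sketch that enumerates all nine equally likely pairs of (first draw, second draw) and counts which branch each pair reaches. The class and method names are made up for illustration; only the if/else structure comes from the code above.

```java
public class BranchProbabilities {
    // Counts, out of the 9 equally likely (first, second) draw pairs,
    // how many end up in each branch of the buggy if/else chain.
    static int[] outcomesOutOfNine() {
        int[] hits = new int[3];
        for (int first = 0; first < 3; first++) {
            for (int second = 0; second < 3; second++) {
                if (first == 0) {
                    hits[0]++;        // first branch: decided by the first draw only
                } else if (second == 1) {
                    hits[1]++;        // second branch: needs first != 0 AND second == 1
                } else {
                    hits[2]++;        // everything else falls through to here
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] hits = outcomesOutOfNine();
        for (int i = 0; i < hits.length; i++) {
            System.out.println("list " + (i + 1) + ": " + hits[i] + "/9");
        }
        // prints:
        // list 1: 3/9
        // list 2: 2/9
        // list 3: 4/9
    }
}
```

These fractions match the measured counts: roughly 33.3, 22.2 and 44.4 million out of almost 100 million.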

Restructuring the code to only generate the random number once and storing this in a temporary variable makes a huge difference:

int rInt = r.nextInt(3); // only generate a random number once.
if (rInt == 0) {
  count0++; // this will run 1/3 of the time.
} else if (rInt == 1) {
  count1++; // this will run 1/3 of the time.
} else {
  count2++; // this will run 1/3 of the time.
}

And the results of this are:
Count0 = 33333907
Count1 = 33332586
Count2 = 33333506

Which looks much better.

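(Note: alternatively you could make the second random number generation be out of 2 rather than 3, i.e. nextInt(2). This also gives each list a 1/3 chance, but it would be slightly slower, as a second random number is generated whenever the first branch is not taken and each one takes a few CPU cycles to calculate.)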
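
For completeness, here is a rough Java sketch of that two-draw alternative, assuming java.util.Random and a fixed seed so that runs are repeatable. The names pick and tally are hypothetical helpers, not part of the students’ code.

```java
import java.util.Random;

public class TwoDrawSelection {
    // Returns 0, 1 or 2. The second draw is out of 2, not 3,
    // so the two remaining lists split the leftover 2/3 evenly.
    static int pick(Random r) {
        if (r.nextInt(3) == 0) {
            return 0;                 // probability 1/3
        } else if (r.nextInt(2) == 0) {
            return 1;                 // probability 2/3 * 1/2 = 1/3
        } else {
            return 2;                 // probability 2/3 * 1/2 = 1/3
        }
    }

    // Runs pick() the given number of times and counts each result.
    static int[] tally(long seed, int trials) {
        Random r = new Random(seed);
        int[] counts = new int[3];
        for (int i = 0; i < trials; i++) {
            counts[pick(r)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = tally(42L, 3_000_000);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("Count" + i + " = " + counts[i]);
        }
    }
}
```

With three million trials each count should land close to one million. The single-draw version is still simpler to read and uses fewer random numbers.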

Replacing titles with captions in WordPress gallery image links

Recently I have installed a colorbox plugin onto this blog for image galleries.

This plugin overrides the default behaviour of an image gallery and replaces individual image pages with a simple in-page pop up of the image. It works quite nicely; however, one problem I have had is that it extracts the title attribute from the former page links rather than the caption of the image that it is displaying.

I have spent the last few days trying to figure out a way to get the behaviour I want out of the plugin. However, this has been much more difficult than expected. The code that generates the galleries in WordPress is buried deep in the codebase and isn’t as straightforward as replacing title=getTitle with title=getCaption. Trying to edit the behaviour of the jQuery in the colorbox plugin isn’t an option either, as it is heavily optimised. And updating every database row to replace the title with the caption also isn’t very practical.

Instead the simple hack below will achieve what I want. It isn’t pretty and I am still looking for a better solution, but in the meantime, after the page has been rendered, the jQuery quickly looks for galleries and replaces the titles in any links with the alt text from the image. This means that when an image loads in the colorbox it displays with a caption.

<script type="text/javascript">
jQuery(document).ready(function($) {
 $('.gallery img').each(function() { 
  $(this).parent().attr('title',$(this).attr('alt')); 
 });
});
</script>

Word of the day: Reabsorbsinged

Okay, I got a lesson in why you shouldn’t rush code on a Friday night.

This morning on my internship I was cleaning up some code I had written last Friday and making sure everything worked as planned.

While checking some of the output of the program I came across this word: Reabsorbsinged. I thought it looked a little odd and upon closer inspection of the code I found out why.

The word Reabsorbsinged is made up of one base word: absorb. From this you can combine prefixes and suffixes to build more words, e.g. Reabsorb, Absorbs, Reabsorbs etc.

However, in my blind coding last week I had failed to notice a crucial mistake I had made when trying to take a shortcut. I had fed my input variable into a function and overwritten it at the same time. This is a good trick if you want to minimise memory and you don’t need to worry if your input variable is overwritten.

Unfortunately I needed my input variable to stay intact to be able to generate the other words (like those above). Instead I ended up with just one huge word: re+absorb+s+ing+ed. So, just as you end up with dick of the day in some jobs, I now have word of the day.
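
To make the mistake concrete, here is a small Java sketch contrasting the two behaviours. It is only an illustration: the original program was C++, and the method names variants and accumulated are hypothetical, as is the choice of prefix and suffixes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordVariants {
    // Intended behaviour: every variant is built from the untouched base word.
    static List<String> variants(String base, String prefix, List<String> suffixes) {
        List<String> out = new ArrayList<>();
        out.add(base);
        out.add(prefix + base);
        for (String suffix : suffixes) {
            out.add(base + suffix);
            out.add(prefix + base + suffix);
        }
        return out;
    }

    // The mistake: the result overwrites the input on every step,
    // so each suffix is stacked on top of the previous one.
    static String accumulated(String base, String prefix, List<String> suffixes) {
        String word = prefix + base;
        for (String suffix : suffixes) {
            word = word + suffix;
        }
        return word;
    }

    public static void main(String[] args) {
        List<String> suffixes = Arrays.asList("s", "ing", "ed");
        System.out.println(variants("absorb", "re", suffixes));
        System.out.println(accumulated("absorb", "re", suffixes)); // prints "reabsorbsinged"
    }
}
```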

And on an entirely different note:

I am currently converting the code that generated this mistake from C++ to C#. Easy enough: C# is pretty similar to Java and doesn’t need pointers, yes! However, as I discovered, its string class doesn’t have a reverse function either.

Glancing around the internet there are a few implementations, ranging from extremely long and memory expensive (i.e. copying each character onto a new string at each step) to extremely quick but near impossible to read, understand or debug. So I got smart and wrote my own.

The code was along the lines of this:


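string normalString = "abcdef";
char[] tempString = normalString.ToCharArray();
// swap characters from both ends, working towards the middle
for (int i = 0; i < tempString.Length / 2; i++) {
    char tempc = tempString[i];
    tempString[i] = tempString[tempString.Length - i - 1];
    tempString[tempString.Length - i - 1] = tempc;
}
// calling ToString() on a char[] returns the type name, so build a new string instead
string reversedString = new string(tempString);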

Two domains, one site, different pages served

I am currently working on a complex website that requires two different companies on two different domains to share the same website. I have just managed to create a piece of PHP code to analyse the domain name typed in and redirect to the correct section of the site for that company. Awesome!

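$url = $_SERVER['HTTP_HOST'];
// strpos() returns 0 for a match at the very start, so compare strictly against false
if (strpos(strtolower($url), "site2") !== false) {
    header('Location: site2/index.php');
} else {
    header('Location: site1/index.php');
}
exit; // stop here so nothing else is sent after the redirect header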
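
The same routing decision can be sketched in Java, which makes the “match at position 0” pitfall easy to test in isolation. This is only an illustration under assumed names: sectionFor and the example.com hosts are hypothetical, and the paths are the ones used above.

```java
import java.util.Locale;

public class HostRouter {
    // Hypothetical helper: maps the requested host name to a section of the site.
    static String sectionFor(String host) {
        String h = (host == null) ? "" : host.toLowerCase(Locale.ROOT);
        // contains() is true wherever the match starts, including index 0
        return h.contains("site2") ? "site2/index.php" : "site1/index.php";
    }

    public static void main(String[] args) {
        System.out.println(sectionFor("site2.example.com"));     // match at index 0
        System.out.println(sectionFor("www.SITE2.example.com")); // case-insensitive match
        System.out.println(sectionFor("www.site1.example.com")); // falls back to site1
    }
}
```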

It is not often that code makes me happy.