HomeAboutSolutionsTechnologyCareersNews & EventsContact Us

Programming puzzles, problems and riddles are a passion at ITA. You may have seen some of our puzzles in recent advertising campaigns around Boston. We use them to help us recruit people who, like us, enjoy the challenge of designing and implementing solutions to interesting computing problems. 

A good way to get your resume onto the top of the pile is to tackle one of the programming puzzles below and send us your solution. Here are a few tips to help you get started:
        

  • Unless otherwise specified, you may use any language you like. 
  • Please send your puzzle answer and resume to careers@itasoftware.com. In your email, along with your source code, please include your program's final answer; a brief (1-2 paragraph) description of your approach and any trade-offs you made (say, for generality, speed, or ease of implementation); and instructions on how to test your program.
  • Puzzle answers will be reviewed only for individuals submitting job applications.
  • No pseudo-code please!

Click the links below to see puzzles for the following positions:
Computer Scientist/Software Engineer
Java Web Application Developer
Server API Developer


Computer Scientist/Software Engineer Puzzles:

 Palindromic Pangram
A palindrome is a sequence of words like "lid off a daffodil" or "shallot ayatollahs" that uses the same letters reading backwards as forwards. The words need not form a meaningful or grammatical sentence.

A palindromic pangram is a multi-word palindrome that includes all 26 letters of the alphabet. Find the shortest sequence of words that is both a pangram and a palindrome. Use this dictionary: WORD.LST (1.66MB).


  BitVector Genealogy

The BitVectors are an ancient and immortal race of 10,000, each with a 10,000 bit genome. The race evolved from a single individual by the following process: 9,999 times a BitVector chosen at random from amongst the population was cloned using an error-prone process that replicates each bit of the genome with 80% fidelity (any particular bit is flipped from parent to child 20% of the time, independent of other bits).


Write a program to guess the reproductive history of BitVectors from their genetic material. The randomly-ordered file bitvectors-genes.data.gz contains a 10,000 bit line for each individual. Your program's output should be, for each input line, the 0-based line number of that individual's parent, or -1 if it is the progenitor. Balance performance against probability of mistakes as you see fit.


To help you test your program, here is a much smaller 500 x 500 input dataset: bitvectors-genes.data.small.gz, along with its solution file: bitvectors-parents.data.small.


 Roll Your Own Chat Server

Implement a simple standalone TCP-based chat server, using the following protocol:

The server responds to all commands with either:
    OK<CRLF>
or, when an error occurred, with:
    ERROR <reason><CRLF>

<CRLF> indicates the bytes '\r\n'.

Upon connecting to the server, the client sends:
    LOGIN <username><CRLF>
Clients can create new chatrooms or join existing chatrooms (chatrooms begin with the character '#') by doing:
    JOIN #<chatroom><CRLF>
Clients can leave chatrooms by sending:
    PART #<chatroom><CRLF>
Clients can be in multiple chatrooms at once.

Clients can send a message to a chatroom:

    MSG #<chatroom> <message-text><CRLF>

Clients can send a message to another user:
    MSG <username> <message-text><CRLF>
When a message is sent to a chatroom the user is in, the server sends to the appropriate client:
    GOTROOMMSG <sender> #<chatroom> <message-text><CRLF>








or if the message is sent directly from one user to another:
    GOTUSERMSG <sender> <message-text><CRLF>
Finally, the client can log off by sending:
    LOGOUT<CRLF>
Here's a transcript of a sample session where a user named "alice" joins a chatroom called #news after connecting. C indicates the line was sent by the client, S indicates it was sent by the server (end of line indicates CRLF was sent):
    C: LOGIN alice 
S: OK
C: JOIN #news
S: OK
C: MSG #news hi everyone
S: GOTROOMMSG bob #news nothing much happened after that
S: OK
S: GOTROOMMSG alice #news hi everyone
S: GOTUSERMSG carol hi alice, where've you been?
C: MSG carol on vacation
S: OK
C: LOGOUT
<server closes connection>
When implementing the server, aim for scalability and robustness. (Many submissions fail due to lack of robustness!) Your submission should include a description of the steps you took towards those two goals. Keep in mind that the client may be buggy, or even malicious. For example, if a client connects to the server and sends an infinite stream of the byte 'X' with no line break, the server should deal with this case gracefully. Please do not use an existing networking framework (e.g., Twisted or asyncore for Python, ACE for C++, etc.) to implement the server. The server should support running on Linux.


 Word Rectangle
Write a program to find the largest possible rectangle of letters such that every row forms a word (reading left to right) and every column forms a word (reading top to bottom). Words should appear in this dictionary: WORD.LST (1.66MB). Heuristic solutions that may not always produce a provably optimal rectangle will be accepted: seek a reasonable tradeoff of efficiency and optimality.


 Strawberry Fields

Strawberries are growing in a rectangular field of length and width at most 50. You want to build greenhouses to enclose the strawberries. Greenhouses are rectangular, axis-aligned with the field (i.e., not diagonal), and may not overlap. The cost of each greenhouse is $10 plus $1 per unit of area covered.

Write a program that chooses the best number of greenhouses to build, and their locations, so as to enclose all the strawberries as cheaply as possible. Heuristic solutions that may not always produce the lowest possible cost will be accepted: seek a reasonable tradeoff of efficiency and optimality.

Your program must read a small integer 1 ≤ N ≤ 10 representing the maximum number of greenhouses to consider, and a matrix representation of the field, in which the '@' symbol represents a strawberry. Output must be a copy of the original matrix with letters used to represent greenhouses, preceded by the covering's cost. Here is an example input-output pair:

Input   Output
4
..@@@@@...............
..@@@@@@........@@@...
.....@@@@@......@@@...
.......@@@@@@@@@@@@...
.........@@@@@........
.........@@@@@........
                 
90
..AAAAAAAA............
..AAAAAAAA....CCCCC...
..AAAAAAAA....CCCCC...
.......BBBBBBBCCCCC...
.......BBBBBBB........
.......BBBBBBB........
In this example, the solution cost of $90 is computed as (10+8*3) + (10+7*3) + (10+5*3).

 

Run your program on the 9 sample inputs found in this file and report the total cost of the 9 solutions found by your program, as well as each individual solution.



  Sling Blade Runner
"How long a chain of overlapping movie titles, like Sling Blade Runner, can you find?"

Use the following listing of movie titles: MOVIES.LST. Multi-word overlaps, as in "License to Kill a Mockingbird," are allowed. The same title may not be used more than once in a solution. Heuristic solutions that may not always produce the greatest number of titles will be accepted: seek a reasonable tradeoff of efficiency and optimality.
 

Data provided by MovieLens at the University of Minnesota.



  Word Numbers
 "If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter?" 

To be precise: if the integers from 1 to 999,999,999 are expressed in words (omitting spaces, 'and', and punctuation[1]), and sorted alphabetically so that the first six integers are

  • eight
  • eighteen
  • eighteenmillion
  • eighteenmillioneight
  • eighteenmillioneighteen
  • eighteenmillioneighteenthousand
and the last is
  • twothousandtwohundredtwo
then reading top to bottom, left to right, the 28th letter completes the spelling of the integer "eighteenmillion".

 

The 51 billionth letter also completes the spelling of an integer. Which one, and what is the sum of all the integers to that point?

[1] For example, 911,610,034 is written "ninehundredelevenmillionsixhundredtenthousandthirtyfour"; 500,000,000 is written "fivehundredmillion"; 1,709 is written "onethousandsevenhundrednine".




 The O'Hare Affair

Objective You want to meet a friend at O'Hare airport at noon and return home the same day. Given a date and an origin airport, find the pair of nonstop flights that gets you to Chicago and home again on that date, minimizing your total time away from home, subject to the constraint that you are at O'Hare at noon.

Details Your program, which must be written in Java, will work by scraping our http://matrix.itasoftware.com website. Create an account on that site called "candidate_YourName" for your program to use. (Please include your name and email address in your registration in case we need to contact you.) Your program should accept two command-line inputs as follows: The program should log in to matrix.itasoftware.com with your candidate_YourName account and pose a round-trip query between the given airportCode and O'Hare (ORD), limited to nonstop flights only, using BOS as the sales city, with appropriate date and time constraints. The program should then scan the resulting solution set for solutions that meet the objective specified above. No more than 15 http requests should be made of matrix.itasoftware.com per invocation. Finally, details of the three shortest solutions (breaking ties arbitrarily) should be written out in human-readable form to the console.

 

For example, for a traveler from Baltimore on July 4th, your program might be invoked as follows: Sample output:

 

Strive to make your code clean and robust, and to report errors cleanly. Be sure to include your source, any necessary libraries, and instructions on how to run it.

 

java -jar OHareAffair.jar 2006-07-04 BWI

 

java -jar OHareAffair.jar YYYY-MM-DD airportCode

 

Travelling round-trip from BWI to ORD on 2006-07-04:

Trip length: 6:21
Outbound: American Airlines Flight AA3991 (10:03a-11:04a)
Return: United Airlines Flight UA1236 (1:30p-4:24p)

Trip length: 6:39
Outbound: United Airlines Flight UA641 (9:45a-10:47a)
Return: United Airlines Flight UA1236 (1:30p-4:24p)

Trip length: 6:54
Outbound: American Airlines Flight AA3991 (10:03a-11:04a)
Return: American Airlines Flight AA4009 (2:09p-4:57p)



Back to top

 

Java Web Application Developer Puzzle:

 Instant Search

Write a Java web application which provides "instant search" over properties listed in the National Register of Historic Places. Rather than waiting for the user to press a submit button, your application will dynamically update search results as input is typed. We provide the file nrhp.xml.gz, which contains selected information from the register's database.

Database
The key component of your server-side application is an efficient, in-memory data structure for looking up properties (written in pure Java). A good solution may take several minutes to load, but can answer a query in well under 0.1 ms on a modern PC. (Note that a sequential search of all properties is probably too slow!) An input matches a property if it is found at any position within that property's names, address, or city+state. Matches are case-insensitive, and consider only the characters A-Z and 0-9, e.g. the input "mainst" matches "200 S Main St" and "red" matches "Lakeshore Dr." Note that the server's JVM will be configured with 1024M maximum heap space. Please conform to the interfaces specified in nrhp.jar when creating your database.

Servlet
Your servlet should accept an input string as the request parameter to a GET request. Results should include the information for a pre-configured number of properties (e.g. 10), the total number of matches which exist in the database, and the time taken by your search algorithm. Your servlet should be stateless, ie. not depend on any per-user session information. Paginate your additional results as a bonus!

Client Your web page should access the servlet using JavaScript's XMLHttpRequest object. As the user types, your interface should repeatedly refine the list of search results without refreshing the page.

Please submit a WAR file, configuration instructions, your source code, and any comments on your approach. Your application will be tested with Tomcat on Sun's 64-bit J2SE and a recent version of Firefox.

Resources XMLHttpRequest, StAX, Apache Tomcat, Mozilla Firefox, Eclipse


Back to top

 

Server API Developer Puzzle:

 Fantasy Baseball

Given this redsox-1996-2005.xml with career hitting statistics for every Red Sox player with at least 100 at bats in the past 10 years, generate an XML file with the best possible batting lineup. The purpose of this puzzle is to show that you know how to use advanced features of XML, so you should use XML, XPath, and XSL wherever possible. For example, a solution might be a set of XSL transforms, optionally with a small script to apply the transforms. The solution must adhere to the following constraints:

  • The lineup must be comprised of 9 {player, year}. This means there are 9 spots in the lineup, and each spot actually refers to a player IN A PARTICULAR YEAR. Note that it's possible for the same real life player to occupy multiple spots in the lineup, as long as each is in a different year. Note also that it's possible for a {player, year} to be split between multiple teams.
  • The lineup must account for all 8 positions, plus a designated hitter. Each {player, year} in the lineup must be assigned either to a particular position or as the designated hitter. Any {player, year} can be assigned as the designated hitter. No position can be used more than once, and all 8 positions must be accounted for. The positions each player is permitted to play are given in the XML file. Again, the same real life player can appear in the lineup multiple times, as long as each time is for a different year AND EACH IS AT A DIFFERENT POSITION.
  • The lineup must maximize TB + BB + SB + R + RBI - K. Spelled out, the formula reads:

total bases (i.e. hits + doubles + (2 * triples) + (3 * home runs)) + walks + stolen bases + runs + rbi - strikeouts

  • The XML generated should contain the lineup and include interesting auxiliary information for each {player, year}. Include information in the output that a consumer might find useful, including team(s),relevant hitting statistics, and composite hitting statistics such as "batting average" (hits / at-bats) and "on base percentage" ((hits + walks) / (at-bats + walks)).

Back to top

 

 
  Contact Us


See all jobs