0
votes

I am trying to think of an algorithm that always produces the optimum solution in the best possible time to this problem:

There are n candidates for a job, and k rooms in which they have scheduled interviews at various times of the day. Interviews have a specific schedule in each room, with each interview having a specified start time (si), finish time (fi), and interview room (ri). All time units are always integers. In addition, we need to schedule pictures of the people currently being interviewed throughout the day. The pictures effectively take no time, but at some point in the day each interviewee must be in a picture. If we schedule a picture at time t, all people currently being interviewed will be in that picture. Taking a picture has no effect on each interview's start and end times. So the problem is this: given an unordered list of interviews, each with variables (si, fi, ri), how do you make sure every interview candidate is in a picture while taking as few pictures as possible?

So ideally we would take pictures when there are as many people present as possible, to minimize the number of pictures taken. My original idea for this was sort of a brute force, but it would have a really bad big-O runtime. It is very important to minimize the runtime of this algorithm while still returning the fewest possible photographs. That being said, if you can think of a fast greedy algorithm that doesn't perfectly solve the problem, I would like to hear that too.

I'm sure my description here was far from flawless, so if you would like me to clarify anything, feel free to leave a comment and I'll get back to you.

2
To clarify, when you take a picture, it includes all interviewees in all rooms? If so, does the room information affect the produced solution in any way? – radiaph
@phari Yes. All interviewees in all rooms are in the picture. The room information is there because sometimes there will be someone in a room and sometimes there won't. There cannot be more than 1 person in a room. So if we have 5 rooms, then there are at most 5 people being interviewed, but there could be 0. Does this answer your question? – Herp Derp

2 Answers

1
votes

Start with the following observations:

  1. At least one picture must be taken during each interview, since we cannot photograph that interviewee before they arrive or after they leave.
  2. The set of people available to photograph changes only at the times si and fi.
  3. After an arrival event si, if the next event j is an arrival, there is no need to take a picture between si and sj, since everyone available at si is still available at sj.
  4. Therefore, you can let the set of available interviewees "build up" through arrival events (up to k of them) and wait to take a picture until someone is about to leave.

Thus I think the following algorithm should work:

  1. Put the arrival and departure times into a list and sort it (times should remain tagged with "arrival" or "departure" and the interviewee's index).
  2. Create a boolean array A of size n to keep track of whether each interviewee is available (interview is in progress).
  3. Create a boolean array P of size n to keep track of whether each interviewee has been photographed.
  4. Loop over the sorted time list (index variable i):

    a. If an arrival for interviewee j is encountered, set A[j] to true.

    b. If a departure for interviewee j is encountered, check P[j] to see if the person leaving has been photographed already. If not, take a picture now and record its effects (set P[m] = true for every m with A[m] = true). Finally, set A[j] to false.

The sort is O(n log n), the loop has 2n iterations, and each array check is O(1). But since each picture-taking event may require a loop over all of A, the overall runtime is O(n²) in the worst case (which happens when no interviews overlap in time, so every departure triggers a picture).
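A minimal C++ sketch of the sweep described above (the event encoding, function name, and the tie-break at equal times are my own choices, not part of the original answer):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Event { int time; bool departure; int who; };

// Returns the times at which pictures are taken, given interviews
// as (start, finish) pairs.  Intervals are treated as inclusive.
std::vector<int> pictureTimes(const std::vector<std::pair<int,int>>& iv) {
    int n = (int)iv.size();
    std::vector<Event> ev;
    for (int i = 0; i < n; i++) {
        ev.push_back({iv[i].first,  false, i});   // arrival
        ev.push_back({iv[i].second, true,  i});   // departure
    }
    // Sort by time; at equal times, process arrivals before departures
    // so a picture taken at that instant can include the newcomer.
    std::sort(ev.begin(), ev.end(), [](const Event& a, const Event& b) {
        return a.time != b.time ? a.time < b.time : a.departure < b.departure;
    });

    std::vector<bool> A(n, false);  // interview currently in progress?
    std::vector<bool> P(n, false);  // already photographed?
    std::vector<int> pics;
    for (const Event& e : ev) {
        if (!e.departure) {
            A[e.who] = true;
        } else {
            if (!P[e.who]) {                  // must photograph before they leave
                pics.push_back(e.time);
                for (int m = 0; m < n; m++)   // O(n) per picture -> O(n^2) worst case
                    if (A[m]) P[m] = true;
            }
            A[e.who] = false;
        }
    }
    return pics;
}
```

For example, with interviews (1,4), (2,6), (5,8) this takes pictures at times 4 and 8.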

0
votes

Here's an O(n log n) solution:

Step 1: Separately sort the starting and finishing times of all interviews, while keeping track of where each entry is sorted to (i.e. the mapping between original indices and sorted indices). This results in the 4 arrays below:

  • sst[] (sst = sorted starting time)

  • sft[] (sft = sorted finishing time)

  • sst2orig[] (sst index to original index)

  • sft2orig[] (sft index to original index)

    Note: by the definitions of the above 4 arrays, "sst2orig[j] = i and sft2orig[k] = i" means that interview [i] has starting time sst[j] and finishing time sft[k]
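Step 1 can be implemented by sorting index arrays with a comparator and then reading the times back through them. A sketch (the function name is mine; the array names match the answer's):

```cpp
#include <algorithm>
#include <vector>

// s[i] and f[i] are the starting/finishing times of interview i.
void buildSortedArrays(const std::vector<int>& s, const std::vector<int>& f,
                       std::vector<int>& sst, std::vector<int>& sft,
                       std::vector<int>& sst2orig, std::vector<int>& sft2orig) {
    int n = (int)s.size();
    sst2orig.resize(n); sft2orig.resize(n);
    for (int i = 0; i < n; i++) sst2orig[i] = sft2orig[i] = i;
    // Sort the index arrays by the corresponding times...
    std::sort(sst2orig.begin(), sst2orig.end(),
              [&](int a, int b) { return s[a] < s[b]; });
    std::sort(sft2orig.begin(), sft2orig.end(),
              [&](int a, int b) { return f[a] < f[b]; });
    // ...then read the sorted times back out through the indices.
    sst.resize(n); sft.resize(n);
    for (int i = 0; i < n; i++) {
        sst[i] = s[sst2orig[i]];
        sft[i] = f[sft2orig[i]];
    }
}
```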

Step 2: Define a boolean array p_taken[] to represent whether the candidate of an interview has already been photographed. All elements are initialized to false.

Step 3: The loop

std::vector<int> photo_time;
int last_p_not_taken_sst_index = 0;
for (int i = 0; i < (int)sft.size(); i++) {
  // Ignore candidates already photographed.
  if (p_taken[sft2orig[i]]) continue;

  // We have found the first leaving candidate not yet photographed,
  // so we must take a photo now.
  photo_time.push_back(sft[i]);

  // We can now mark every candidate whose starting time is <= sft[i]
  // as photographed.  Search for the first element in sst[] that is
  // strictly greater than sft[i] and take its index (sst.size() if
  // no such element exists).  This is a binary search:
  int k = (int)(std::upper_bound(sst.begin(), sst.end(), sft[i]) - sst.begin());

  // Mark all candidates with starting time before sst[k] as
  // "photographed".  This includes the one corresponding to sft[i].
  for (int j = last_p_not_taken_sst_index; j < k; j++)
    p_taken[sst2orig[j]] = true;
  last_p_not_taken_sst_index = k;
}

The final answer is saved in photo_time, and the number of photos is photo_time.size().
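For reference, here is a self-contained sketch that wires steps 1–3 together (the function name and the use of std::upper_bound for the binary search are my additions):

```cpp
#include <algorithm>
#include <vector>

// Interviews given as parallel arrays: s (starting times), f (finishing times).
// Returns the times at which photos are taken.
std::vector<int> minPhotoTimes(const std::vector<int>& s, const std::vector<int>& f) {
    int n = (int)s.size();
    // Step 1: sort starts and finishes, remembering original indices.
    std::vector<int> sst2orig(n), sft2orig(n);
    for (int i = 0; i < n; i++) sst2orig[i] = sft2orig[i] = i;
    std::sort(sst2orig.begin(), sst2orig.end(), [&](int a, int b){ return s[a] < s[b]; });
    std::sort(sft2orig.begin(), sft2orig.end(), [&](int a, int b){ return f[a] < f[b]; });
    std::vector<int> sst(n), sft(n);
    for (int i = 0; i < n; i++) { sst[i] = s[sst2orig[i]]; sft[i] = f[sft2orig[i]]; }

    // Step 2: nobody has been photographed yet.
    std::vector<bool> p_taken(n, false);

    // Step 3: photograph each first-to-leave unphotographed candidate.
    std::vector<int> photo_time;
    int last = 0;
    for (int i = 0; i < n; i++) {
        if (p_taken[sft2orig[i]]) continue;
        photo_time.push_back(sft[i]);
        int k = (int)(std::upper_bound(sst.begin(), sst.end(), sft[i]) - sst.begin());
        for (int j = last; j < k; j++) p_taken[sst2orig[j]] = true;
        last = k;
    }
    return photo_time;
}
```

For example, minPhotoTimes({1, 2, 5}, {4, 6, 8}) returns {4, 8}.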

Time Complexity:

Step 1: Sorts: O(n log n)

Step 2: initialize p_taken[]: O(n)

Step 3: We loop n times, and in each loop

3-1 check p_taken: O(1)

3-2 binary search: O(log n)

3-3 mark candidates: aggregated O(n) over the whole loop, since each candidate is marked only once.

So, overall for step 3: O(n x ( 1 + log n) + n) = O(n log n)

Step 1 ~ 3, total: O(n log n)

Note that step 3 can be further optimized: the binary-search range can be shrunk to exclude the portion covered by previous searches. But the worst case is still O(log n) per iteration, so the total remains O(n log n).