Appendix I: Multimerge

This function merges a set of sorted lists into a single sorted list. The inputs to multimerge() are:

  • start[] – an array of the starting addresses of the sorted lists,
  • lengths[] – an array of the lengths of the sorted lists,
  • Number – the number of sorted lists,
  • newArray[] – the array where to place the sorted list
  • newArrayLength – the length of the array newArray[]

Trace through the following figure that works through the initial steps of merging four sorted lists into one. A priority queue plays an important part of this algorithm.

Really great picture

It is important that each of the input sorted lists is, in fact, sorted. If any of these lists is not sorted, the output will not be sorted.

Essentially the sorted lists are merge by repeated moving the smallest element from the lists to the new list. The process of choosing this smallest element is performed very efficiently by a priority queue.

Questions

  1. Work through the next several steps of the multimerge shown in the figure.
  2. The priority queue in the figure is drawn as a binary tree. How is this binary tree most often implemented for the priority queue? Draw this representation for each of the priority queues.

Complexity

Let \(L = \sum_{i=0}^{Number}Lengths[i]\). The time complexity to multimerge these sorted lists is \(\Theta(L\log(Number))\), and the space complexity is \(2L+Number\).

multimerge.h

This is the header file for multimerge().

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// Header file for multimerge.cpp
// David John
// (c) Wake Forest University, 2013

#ifndef MYDATA
#define MYDATA

#include <vector>
#include <queue>

using namespace std;

// prototype for multimerge function
int multimerge(int * start[], const int lengths[], const int Number, int newArray[], const int newArrayLength);

#endif

multimerge.cc

This is the c++ source code for multimerge(). The structure mmdata provides for simple use of the priority queue.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
// multimerge.cpp
// David John
// (c) Wake Forest University, 2013

// Function to merge Number sorted arrays into one sorted array
//
// The addresses and lengths of the sorted arrays are passed via *start[] and lengths[]
//    start[j][k]  represents the kth value in the ith array


#include "multimerge.h"

// This structure is what is placed in the priority queue: an index to the 
//		appropriate array,  an index to an element in the appropriate array, and 
//		the value of stored at the index of the element in the appropriate array
struct mmdata 
{
	int stindex;
	int index;
	int stvalue;

	mmdata(int st=0, int id=0, int stv = 0):stindex(st),index(id),stvalue(stv){}

};


// comparison operator
bool operator<( const mmdata & One, const mmdata & Two)
{
	return One.stvalue > Two.stvalue;
}

int multimerge(int * starts[], const int lengths[], const int Number, 
			   int newArray[], const int newArrayLength)
{
 	// Create priority queue.  There will be at most one item in the priority queue
 	// for each of the Number lists.
 	priority_queue< mmdata> priorities;

 	// Examine each of the Number start[] lists, place the first location into 
	// the priority 	queue if the list is not empty
 	for(int i=0; i<Number;i++)
 	{
		if (lengths[i]>0)
		{
			priorities.push(mmdata(i,0,starts[i][0]));
		}
	}


	// As long as priorities is not empty, pull off the top member (the smallest 
	//value from list i), push it into the newArray, and place the next element from 
	// list i in the priority queue
	int newArrayindex = 0;  // index into the merged array
	while (!priorities.empty() && (newArrayindex<newArrayLength))
	{
		// grab the smallest element, and remove it from the priority queue
		mmdata xxx = priorities.top();
		priorities.pop();

		// insert this smallest element into the merged array
		newArray[newArrayindex++] = starts[xxx.stindex][xxx.index];

		// if start[xxx.stindex] is not empty, place the next member into priority
		if ( lengths[xxx.stindex]>(xxx.index+1))
		{
			priorities.push(mmdata(xxx.stindex, xxx.index+1, 
								starts[xxx.stindex][xxx.index+1]));
		}
}

// return the logical size of the merged array
return newArrayindex;
}

Questions

  1. What exactly are the functions of mmdata in the program?
  2. How does a priority queue work?
  3. On line 55, what are the two boolean expressions in the condition checking for? Why?

Table Of Contents

Previous topic

Parallel sorting by regular sampling: analysis

Next topic

Appendix II: Utilities

This Page