This gives you two sets, each of three equations in 3 variables:
a*x0+b*y0+c*z0 = x0'
a*x1+b*y1+c*z1 = x1'
a*x2+b*y2+c*z2 = x2'
d*x0+e*y0+f*z0 = y0'
d*x1+e*y1+f*z1 = y1'
d*x2+e*y2+f*z2 = y2'
Just use whatever method of solving simultaneous equations is easiest in your situation (it isn't even hard to solve these "by hand"). Then your transformation matrix is just ((a,b,c)(d,e,f)).
...
Actually, that is over-simplified and assumes a camera pointed at the origin of your 3D coordinate system and no perspective.
For perspective, the transformation matrix works more like:
( a, b, c, d ) ( xt )
( x, y, z, 1 ) ( e, f, g, h ) = ( yt )
( i, j, k, l ) ( zt )
( xv, yv ) = ( xc+s*xt/zt, yc+s*yt/zt ) if md < zt;
but the 4x3 matrix is more constrained than 12 degrees of freedom since we should have
a*a+b*b+c*c = e*e+f*f+g*g = i*i+j*j+k*k = 1
a*a+e*e+i*i = b*b+f*f+j*j = c*c+g*g+k*k = 1
So you should probably have 4 points to get 8 equations to cover the 6 variables for camera position and angle and 1 more for scaling of the 2-D view points since we'll be able to eliminate the "center" coordinates (xc,yc).
So if you have 4 points and transform your 2-D view points to be relative to the center of your display, then you can get 14 simultaneous equations in 13 variables and solve.
Unfortunately, six of the equations are not linear equations. Fortunately, all of the variables in those equations are restricted to the values between -1 and 1 so it is still probably feasible to solve the equations.