You need to know the intrinsic parameters of the camera to do that.
Consider the z=0 plane. The point
X=(x,y,0,1)'
is projected to the image as
p=P*X.
Now use the decomposition
P=K[R t],
where K is the calibration matrix and [R t] are extrinsic parameters. Since z=0, the third column vector of R is multiplied by zero. We can now drop the 3rd column to get
p=K*[r1 r2 t]*(x,y,1)'=H*(x,y,1)',
where H is a planar homography.
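As a quick numerical check of this step, here is a minimal NumPy sketch. The intrinsics K, the rotation R and the translation t are made-up values chosen only for illustration; they are not from the question.

```python
import numpy as np

# Hypothetical intrinsics and pose, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, c, -s],
              [0.0, s, c]])          # rotation about the x axis
t = np.array([0.1, -0.2, 5.0])

# Full projection matrix P = K [R t] (3x4).
P = K @ np.hstack([R, t.reshape(3, 1)])

# Planar homography H = K [r1 r2 t] (3x3): third column of R dropped.
H = K @ np.column_stack([R[:, 0], R[:, 1], t])

# A point on the z=0 plane, in both parameterizations.
x, y = 1.5, -0.7
p_from_P = P @ np.array([x, y, 0.0, 1.0])
p_from_H = H @ np.array([x, y, 1.0])

print(np.allclose(p_from_P, p_from_H))
```

Both products give the same homogeneous image point, because the third column of R only ever multiplies z=0.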
You have already computed H, e.g. from known point correspondences. The first and second columns of R and the vector t can now be recovered from
[r1 r2 t]=inv(K)*H.
H is only determined up to scale, so rescale the result such that r1 and r2 have unit length; t is then the correctly scaled translation vector. The third column vector of R can be recovered because R is orthogonal, for example using the cross product.
r3=cross(r1,r2).
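The recovery steps above could be sketched in NumPy as follows. The function name pose_from_homography and the example values are hypothetical. The sign flip is an extra assumption of mine: it requires the plane to lie in front of the camera (positive z component of t), which resolves the sign ambiguity of H.

```python
import numpy as np

def pose_from_homography(H, K):
    """Recover R = [r1 r2 r3] and t from a planar homography H = s*K*[r1 r2 t]."""
    M = np.linalg.inv(K) @ H            # proportional to [r1 r2 t]
    M = M / np.linalg.norm(M[:, 0])     # rescale so that r1 has unit length
    # Assumption: the plane is in front of the camera, so t_z must be positive.
    if M[2, 2] < 0:
        M = -M
    r1, r2, t = M[:, 0], M[:, 1], M[:, 2]
    r3 = np.cross(r1, r2)               # third column from orthogonality of R
    return np.column_stack([r1, r2, r3]), t

# Synthetic test data (made-up values): build H from a known pose,
# then scale it arbitrarily, including a sign change.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, c, -s],
                   [0.0, s, c]])
t_true = np.array([0.1, -0.2, 5.0])
H = -2.5 * K @ np.column_stack([R_true[:, 0], R_true[:, 1], t_true])

R_est, t_est = pose_from_homography(H, K)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

With a noise-free H this reproduces the original pose. With a measured H, r1 and r2 will generally have slightly different lengths and will not be exactly perpendicular.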
Since H is a measurement, the r1 and r2 you computed are not exact, so [r1 r2 r3] is generally not a perfect rotation matrix. You can use the SVD to obtain the rotation matrix closest to it. You can then compose a projection matrix
P=K[r1 r2 r3 t]
which projects any 3D point given in the coordinate frame defined by the 2D coordinate system of the homography.
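A minimal sketch of the SVD step and of composing P, again in NumPy. The helper name closest_rotation and all numeric values (including the artificial noise) are my own assumptions for illustration. "Closest" here means closest in the Frobenius norm.

```python
import numpy as np

def closest_rotation(M):
    """Rotation matrix closest to M in the Frobenius norm, via the SVD."""
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    # Guard against a reflection (det = -1): flip the last singular direction.
    if np.linalg.det(R) < 0:
        U[:, -1] *= -1
        R = U @ Vt
    return R

# Made-up example: a true rotation, perturbed to mimic measurement noise.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, c, -s],
                   [0.0, s, c]])
t = np.array([0.1, -0.2, 5.0])

rng = np.random.default_rng(0)
R_noisy = R_true + 0.01 * rng.standard_normal((3, 3))

R = closest_rotation(R_noisy)

# Compose the full projection matrix P = K [r1 r2 r3 t].
P = K @ np.hstack([R, t.reshape(3, 1)])

# Project an arbitrary 3D point given in the plane's coordinate frame.
X = np.array([1.0, 2.0, 0.5, 1.0])
p = P @ X
pixel = p[:2] / p[2]   # dehomogenize to pixel coordinates
```

The result R is exactly orthogonal with determinant +1, which the raw [r1 r2 r3] from a noisy H usually is not.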
Here is some course material, which describes this situation.
https://www.dropbox.com/s/qkulg4j64lyn0qa/2018_proj_geo_for_cv_projcv_assignment.pdf?dl=0
Here is a related question.
Computing camera pose with homography matrix based on 4 coplanar points
As @nbsrujan (thanks) pointed out, for those using OpenCV, there is a function (decomposeHomographyMat) which can decompose a homography into candidate rotation matrices and translation vectors given the intrinsics.