So I'm working on some pixel shaders for emulators of classic consoles like the Super Nintendo. There are the classic algorithms like HQnx, 2xSaI, etc., which were written to run on CPUs and to scale the image to exactly twice its size before blitting to the screen.
Moving on to GPU fragment shaders, these algorithms can be done essentially for free. I'm working with OpenGL and Cg/GLSL, but this question should apply to Direct3D/HLSL coders as well.
The major problem is that these algorithms blend against neighboring pixels, using some rule to decide on the final color. However, I found this concept quite hard to express in shader languages. Generally, a fragment shader receives a floating-point texture coordinate, which it can use to do texture lookups, usually with GL_LINEAR as the texture filter. Most of these pixel shaders use GL_NEAREST instead and do the smoothing themselves.
The problem occurs when I want to find, say, the exact neighboring pixel. I've seen some implementations, but they occasionally cause artifacts on the screen, probably due to floating-point inaccuracies. I've found that most of the artifacts simply disappear when using power-of-two sized textures, which further strengthens my belief that floating-point inaccuracies are involved. Here is a sample fragment shader in Cg that shows the issue:
    struct output
    {
        float4 color : COLOR;
    };

    struct input
    {
        float2 video_size;
        float2 texture_size;
        float2 output_size;
    };

    struct deltas
    {
        float2 UL, UR, DL, DR;
    };

    output main_fragment (float2 tex : TEXCOORD0, uniform input IN, uniform sampler2D s_p : TEXUNIT0)
    {
        float2 texsize = IN.texture_size;
        float dx = pow(texsize.x, -1.0) * 0.25;
        float dy = pow(texsize.y, -1.0) * 0.25;
        float3 dt = float3(1.0, 1.0, 1.0);

        deltas VAR = {
            tex + float2(-dx, -dy),
            tex + float2(dx, -dy),
            tex + float2(-dx, dy),
            tex + float2(dx, dy)
        };

        float3 c00 = tex2D(s_p, VAR.UL).xyz;
        float3 c20 = tex2D(s_p, VAR.UR).xyz;
        float3 c02 = tex2D(s_p, VAR.DL).xyz;
        float3 c22 = tex2D(s_p, VAR.DR).xyz;

        float m1 = dot(abs(c00 - c22), dt) + 0.001;
        float m2 = dot(abs(c02 - c20), dt) + 0.001;

        output OUT;
        OUT.color = float4((m1 * (c02 + c20) + m2 * (c22 + c00)) / (2.0 * (m1 + m2)), 1.0);
        return OUT;
    }
Is there some way to make sure that we grab the color data from the pixel we expect and not a different one? I believe this problem occurs because we might be querying a pixel at a coordinate that lies right between two pixels (if that makes sense). Hopefully there is some built-in function in these shader languages that I'm overlooking.
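To make "the pixel we expect" concrete, here is a minimal sketch of the kind of lookup I have in mind. It assumes GL_NEAREST filtering, that `tex` spans 0 to 1 across the whole texture, and that `texture_size` is the texture's size in texels. `texel_center` and `sample_neighbor` are hypothetical helper names, not built-ins, and this is only an illustration of the idea rather than a confirmed fix:

```
// Hypothetical helper (not a built-in): move a coordinate to the center
// of the texel that contains it. Assumes tex is in [0, 1] over the full
// texture and texture_size is the texture's size in texels.
float2 texel_center(float2 tex, float2 texture_size)
{
    return (floor(tex * texture_size) + 0.5) / texture_size;
}

// Hypothetical helper: sample the texel that lies `offset` whole texels
// away from the one containing tex, e.g. offset = float2(1.0, 0.0) for
// the right-hand neighbor.
float3 sample_neighbor(sampler2D s, float2 tex, float2 texture_size, float2 offset)
{
    float2 center = texel_center(tex, texture_size);
    return tex2D(s, center + offset / texture_size).xyz;
}
```

The idea is that every lookup lands half a texel away from any texel edge, so a small rounding error in the coordinate should not change which texel GL_NEAREST picks. It would not help if `tex` itself already sits exactly on a texel edge, since `floor` could then go either way.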