Update records and set values

Question

I have a table called Transaction which contains some columns: [TransactionID, Type(credit or debit), Amount, Cashout, CreditPaid, EndTime]

Customers can take out lots of credit, and these transactions are stored in the transactions table. If a customer pays an amount at the end of the month that covers some or all of the credit transactions, I want those transactions to be updated accordingly.

For example, a customer pays in 300. If the transaction 'Amount' is 300 and 'Type' is credit then the 'CreditPaid' amount should be 300. (This is a simple update statement) but...

If there are two transactions i.e. one 300 and another 400 and are both credit and the monthly payment amount is 600, then the oldest transaction should be paid 300 in full, and the next transaction 300 leaving 100 outstanding.

Any ideas how to do this?

TrID  Buyin  Type    Cashout  CustID  StartTime   EndTime     AddedBy  CreditPaid
72    200    Credit  0        132     2013-05-21  NULL        NULL     NULL
73    300    Credit  0        132     2013-05-22  NULL        NULL     NULL
75    400    Credit  0        132     2013-05-23  NULL        NULL     NULL

Desired Results after customer pays 600

TrID  Buyin  Type    Cashout  CustID  StartTime   EndTime     AddedBy  CreditPaid
72    200    Credit  0        132     2013-05-21  2013-05-24  NULL     200
73    300    Credit  0        132     2013-05-22  2013-05-24  NULL     300
75    400    Credit  0        132     2013-05-23  NULL        NULL     100

You really need to elaborate with sample data and your expected results. — Gordon Linoff
Although it sounds like a completely different question, there is a very useful example that can probably be adapted here: stackoverflow.com/questions/10327741/… — bendataclear
What's EndTime for? If that records when a credit is paid off, what happens if it's paid in installments? Personally, I think I'd want more history of when things were paid, meaning another table. — Clockwork-Muse
Think about this: when the customer pays $300 and owes $700 (the 300 + 400), are they paying off the $300 or just paying against the total owed? In most cases they are paying against the total owed, so what is paid is not mapped to individual transactions, but simply to the total. — thursdaysgeek
Your suggestions are really helpful. How do you think I should handle this? ClockworkMuse mentioned another table. The thing is that I want to make it as convenient as possible for the user, who won't be bothered to go through all the transactions and pay each one individually. So I thought that entering one big lump sum into the system and auto-updating all the credit transactions would be the best solution. * EndTime is the date of the payment of the outstanding debt. — alwaysVBNET

ErikE · Accepted Answer · 2013-05-24T23:48:38

Here's a SQL 2008 version:

CREATE PROCEDURE dbo.PaymentApply
   @CustID int,
   @Amount decimal(11, 2),
   @AsOfDate datetime
AS
WITH Totals AS (
   SELECT
      T.*,
      RunningTotal =
         Coalesce (
            (SELECT Sum(S.Buyin - Coalesce(S.CreditPaid, 0))
            FROM dbo.Trans S
            WHERE
               T.CustID = S.CustID
               AND S.Type = 'Credit'
               AND S.Buyin - Coalesce(S.CreditPaid, 0) > 0 -- only rows still owing
               AND (
                  T.Starttime > S.Starttime
                  OR (
                     T.Starttime = S.Starttime
                     AND T.TrID > S.TrID
                  )
               )
            ),
        0)
   FROM
      dbo.Trans T
   WHERE
      T.CustID = @CustID
      AND T.Type = 'Credit'
      AND T.Buyin - Coalesce(T.CreditPaid, 0) > 0 -- only rows still owing
)
UPDATE T
SET
   T.EndTime = P.EndTime,
   T.CreditPaid = Coalesce(T.CreditPaid, 0) + P.CreditPaid
FROM
   Totals T
   CROSS APPLY (
      SELECT TOP 1
         V.*
      FROM
         (VALUES
            (T.Buyin - Coalesce(T.CreditPaid, 0), @AsOfDate),
            (@Amount - RunningTotal, NULL)
         ) V (CreditPaid, EndTime)
      ORDER BY
         V.CreditPaid,
         V.EndTime DESC
   ) P
WHERE
   T.RunningTotal < @Amount
   AND @Amount > 0;

See a Live Demo at SQL Fiddle

Or, for anyone using SQL 2012, you can replace the contents of the CTE with a better-performing and simpler query using the new windowing functions:

SELECT
   *,
   RunningTotal =
      Sum(Buyin - Coalesce(CreditPaid, 0)) OVER(
         ORDER BY StartTime, TrID
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
      ) - (Buyin - Coalesce(CreditPaid, 0)) -- remove the current row's own balance
FROM dbo.Trans
WHERE
   CustID = @CustID
   AND Type = 'Credit'
   AND Buyin - Coalesce(CreditPaid, 0) > 0

See a Live Demo at SQL Fiddle

Here's how they work:

We calculate the running total for all the prior rows where the CreditPaid amount is less than the Buyin amount. Note this does NOT include the current row.
From this we can determine what portion of the payment will apply to each row and which rows will be involved in the payment. If the sum of the outstanding credit on all the prior rows equals or exceeds the payment, then this row will NOT be included, thus T.RunningTotal < @Amount. That's because the prior rows will have fully consumed the payment by this point, so we can stop applying it.
For each row where we will apply a payment, we want to pay as much as possible, but we have to pay attention to the last row where we may not be paying the full amount (as is the case with the third credit in the example). So we'll be paying one of two amounts: the full credit amount (with more rows to receive payments) or only the portion left over, which could be less than the full credit for that row (and this is the last row). We accomplish this by taking the lesser of either 1) the full remaining Buyin - CreditPaid amount, or 2) what's left of the full amount, @Amount - RunningTotalOfPriorRows. I could have done this as a CASE expression, but I like picking the minimum with TOP 1 ... ORDER BY over a VALUES list, especially because we would otherwise have needed two CASE expressions to also determine whether to update the EndTime column (per your requirements).
The SQL 2012 version simply calculates the same thing as the 2008 version: the sum of Buyin - CreditPaid for all the prior rows, using a windowing function instead of a correlated subquery.
Finally, we perform the update to all rows where the RunningTotal is less than the amount to be applied (since if it were equal to the amount, there would be no payment left for the current row).
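
To make the steps above concrete, here is a minimal sketch that previews the allocation without updating anything. It uses the three sample rows from the question as an inline VALUES list and a payment of 600. The Outstanding and Applied column names are illustrative assumptions, and the CASE expression is a stand-in for the TOP 1 logic in the procedure, not a copy of it.

```sql
-- Sample rows from the question; @Amount is the 600 payment.
DECLARE @Amount decimal(11, 2) = 600;

WITH Trans AS (
   SELECT *
   FROM (VALUES
      (72, 200, CAST('2013-05-21' AS date), CAST(NULL AS decimal(11, 2))),
      (73, 300, CAST('2013-05-22' AS date), NULL),
      (75, 400, CAST('2013-05-23' AS date), NULL)
   ) V (TrID, Buyin, StartTime, CreditPaid)
),
Totals AS (
   SELECT
      T.*,
      Outstanding = T.Buyin - Coalesce(T.CreditPaid, 0),
      -- Unpaid credit on all earlier rows (excludes the current row).
      RunningTotal = Coalesce((
         SELECT Sum(S.Buyin - Coalesce(S.CreditPaid, 0))
         FROM Trans S
         WHERE S.StartTime < T.StartTime
            OR (S.StartTime = T.StartTime AND S.TrID < T.TrID)
      ), 0)
   FROM Trans T
)
SELECT
   TrID,
   Outstanding,
   RunningTotal,
   -- Lesser of the row's outstanding amount and what is left of the payment.
   Applied = CASE
      WHEN RunningTotal >= @Amount THEN 0
      WHEN @Amount - RunningTotal < Outstanding THEN @Amount - RunningTotal
      ELSE Outstanding
   END
FROM Totals
ORDER BY StartTime, TrID;
```

With the sample data, the running totals are 0, 200 and 500, so the applied amounts come out as 200, 300 and 100 for TrIDs 72, 73 and 75, matching the desired results in the question.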

Now, there are some larger considerations that you should think about.

Some of your scheme I like--I am not convinced that, as some commenters have said, you should ignore the individual transactions. I think that handling the individual transactions can be very important. It's much like how hospitals have one medical record number for each patient (MRN) but open a new account / file / visit each time the patient has a service performed. Each account is treated separately, and this is for many reasons, including--and this is where it seems important for you, too--the need for the customer to understand what exactly is comprising the total. It can be shocking to see the total all added up, but when this is broken out into individual transactions on individual dates, this makes a lot more sense to people and they can begin to understand exactly how they spent more money than they remembered at the time. "You owe me 600 bucks" can be harder to face than "your transactions for $100, $300, and $200 are still unpaid". :)

So, on to some big considerations here.

If you go with the theory that a transactional or balance-based account starts at 0 as a sort of "anchor", and to find the current balance you simply have to add up all the transactions: well, this does indeed satisfy relational theory, but in practice it is completely unworkable because it does not provide a fast, accurate way to get the current balance. It is imperative to have the current balance saved as a discrete value. If you were a bank, how would you know how much money you had, without adding up perhaps dozens of years of transaction history each time? Instead, it may be better to think of the current balance as the "anchor" (instead of 0) and think of the transactions as going backward in time. Additionally, there is no harm in recording periodic balances. Banks do this by closing out periods into statements, with a defined balance as of each statement closing date. There is no need to go all the way back to zero, since you don't care too much about the balance at the old, unanchored end of the history. You know that eventually every account started at 0. Big deal!

Given these thoughts, it is important for you to have a table where the customer's total account balance is simply stated. You also need a place to record his payments, refunds, cancellations, and so on. This should be separate from the accounts (in your case, transactions) themselves, because there is not a one-to-one correspondence between payment transactions and credit transactions. Already in your current scheme you have partially paid transactions with no date recorded--this is a huge gap in the system that will come back to bite you. What if a customer paid $10 a day toward a $200 credit for 20 days? 19 of those payments would show no date paid.
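
One possible shape for those tables is sketched below. Every table name, column name and data type here is a hypothetical choice for illustration; nothing in the question's schema dictates them.

```sql
-- Hypothetical tables; all names and column types are assumptions.
CREATE TABLE dbo.CustomerBalance (
   CustID int NOT NULL PRIMARY KEY,
   Balance decimal(11, 2) NOT NULL DEFAULT 0   -- current amount owed, stored directly
);

CREATE TABLE dbo.Payment (
   PaymentID int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
   CustID int NOT NULL,
   Amount decimal(11, 2) NOT NULL,
   PaidAt datetime NOT NULL
);

-- One row per portion of a payment applied to a credit transaction,
-- so every partial payment keeps its own date via dbo.Payment.PaidAt.
CREATE TABLE dbo.PaymentApplication (
   PaymentID int NOT NULL REFERENCES dbo.Payment (PaymentID),
   TrID int NOT NULL,   -- refers to dbo.Trans.TrID
   AppliedAmount decimal(11, 2) NOT NULL,
   PRIMARY KEY (PaymentID, TrID)
);
```

Under this layout, the customer paying $10 a day for 20 days would produce 20 rows in dbo.Payment and 20 matching rows in dbo.PaymentApplication, each with its own date, instead of one CreditPaid value with a single EndTime.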

What I recommend, then, is that you create a stored procedure (SP) that applies payments to totals first, and then create another one that will "rewrite" the payments into the transactions in an on-demand way. Think about what a credit card company has to do if they "re-rate" your account. Perhaps they acted on incorrect information and increased your interest rate on a certain date. (This actually happened to me. I proved to them that the collections activity they were responding to was not my fault--it had been retracted by the original company after I showed them that one of their staff had mistakenly changed my mailing address, and I had never received a bill to be able to pay. So they had to be able to re-run all the purchase/debit/interest rate calculations on my account retroactively, to recalculate everything after the original change date based on the correct interest rate.) Think about this a bit and you will see that it is quite possible to operate this way, as long as you design your system properly. Your SP is given a date range or set of transactions within which it is allowed to work, and then "rewrites" history as if the old history had never existed.

But, you don't actually want to destroy history, so this is further complicated by the fact that at one point in time, your best knowledge of the customer's account balance for a time period was a different amount than your current best knowledge of their account balance for that time period--both are true data and need to be kept.

Let's say you discover that your system occasionally doubled up Credit transactions mistakenly. When you fix the customer data, you need to be able to see the fact that they had the problem, even though they don't have it now. This is done by using additional date columns EffectiveDate and ExpirationDate--or whatever you want to call them. Then, these need to be part of the clustered index, and used on every query to always get the current values. I highly recommend using 9999-12-31 instead of NULL as your ExpirationDate value for current rows--this will have a huge positive impact on performance when querying for current data. I also recommend putting the ExpirationDate as the first column in the clustered index (or at least, before the EffectiveDate column), since history will always potentially have many more records than the future, so it will be more selective than EffectiveDate being first (think a bit: all past knowledge will have EffectiveDate <= GetDate() but only current or future data will have ExpirationDate > GetDate()). To drive the point home: you don't delete. You expire old rows by setting a column to the date the knowledge became obsolete, and you insert new rows representing the new knowledge, with a column showing the date you learned this information and having an indefinitely-open "to the future" value in the other date column.
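
A minimal sketch of that expire-and-insert pattern follows. The table dbo.TransHistory, its columns and the index name are hypothetical stand-ins for whatever the real table ends up being.

```sql
-- Hypothetical history-keeping table; names and types are assumptions.
CREATE TABLE dbo.TransHistory (
   TrID int NOT NULL,
   CustID int NOT NULL,
   Buyin decimal(11, 2) NOT NULL,
   CreditPaid decimal(11, 2) NOT NULL DEFAULT 0,
   EffectiveDate datetime NOT NULL,
   ExpirationDate datetime NOT NULL DEFAULT '99991231'   -- "current" marker instead of NULL
);

-- ExpirationDate leads the clustered index, as recommended above.
CREATE CLUSTERED INDEX CIX_TransHistory
   ON dbo.TransHistory (ExpirationDate, EffectiveDate, TrID);

-- The original knowledge about transaction 72.
INSERT dbo.TransHistory (TrID, CustID, Buyin, CreditPaid, EffectiveDate)
VALUES (72, 132, 200, 0, '20130521');

-- Correct the row without deleting it: expire the old version, insert the new one.
DECLARE @Now datetime = GetDate();

BEGIN TRANSACTION;

UPDATE dbo.TransHistory
SET ExpirationDate = @Now
WHERE TrID = 72
   AND ExpirationDate = '99991231';

INSERT dbo.TransHistory (TrID, CustID, Buyin, CreditPaid, EffectiveDate)
VALUES (72, 132, 200, 200, @Now);

COMMIT TRANSACTION;
```

Both versions of the row remain in the table afterwards. The expired one records what was believed until @Now, and the new one carries the open-ended 9999-12-31 expiration.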

And finally a couple of single points:

The CreditPaid column should be NOT NULL with a default of 0. I had to throw in a bunch of Coalesces to deal with the NULLs.
You need to handle overpayments somehow. Either by preventing them, or by storing the overpaid portion value and applying it later. You could OUTPUT the results of the UPDATE statement into a table, then select the Sum from this and make the SP return any unused payment value. There are many ways to handle this. If you build the "re-rate" SP as I suggested, then this won't be too much of a problem, as you can rerun it after receiving new transactions (then immediately (re)apply all payments for any open periods).
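
As a rough illustration of the OUTPUT idea, the sketch below captures what an UPDATE actually applied and derives the unused part of the payment from it. It runs against table variables, and its UPDATE is a deliberately simplified stand-in (every row is paid in full) rather than the statement from dbo.PaymentApply. The @Applied table and the Unused column are hypothetical names.

```sql
DECLARE @Amount decimal(11, 2) = 1000;   -- payment larger than the 900 owed

DECLARE @Trans TABLE (
   TrID int,
   Buyin decimal(11, 2),
   CreditPaid decimal(11, 2) NOT NULL DEFAULT 0   -- no NULLs, so no Coalesce needed
);
DECLARE @Applied TABLE (Amount decimal(11, 2));

INSERT @Trans (TrID, Buyin) VALUES (72, 200), (73, 300), (75, 400);

-- Simplified stand-in for the payment UPDATE: pay every row in full
-- and record how much each row's CreditPaid increased.
UPDATE @Trans
SET CreditPaid = Buyin
OUTPUT inserted.CreditPaid - deleted.CreditPaid INTO @Applied (Amount);

-- Whatever was not applied is the overpayment to return or store.
SELECT Unused = @Amount - Coalesce(Sum(Amount), 0)
FROM @Applied;
```

Here 900 of the 1000 is applied, leaving 100 unused. In a real procedure that value could be returned through an OUTPUT parameter or stored for later application.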

At this point I can't go on too much more, but I hope that these thoughts help you. Your design is a good start, but it needs some work to get it to the point where it will function well as an enterprise-quality system.

UPDATE

I corrected a glitch in the 2008 version (adding the conditions from the outer query to the subquery).

And here's my last edit (all: please do not edit this answer again or it will be converted to community wiki).

If you do go with a scheme where rows are marked with the dates they are understood to be true (EffectiveDate and ExpirationDate), you can make coding in your system a little easier by creating inline table functions that select only the active rows from the table WHERE EffectiveDate <= GetDate() AND GetDate() < ExpirationDate. Pay careful attention to the comparison operators you're using (e.g., <= vs <), and use date ranges that are inclusive at the start and exclusive at the end. If you aren't sure what that means, please do look these terms up and understand them before proceeding. You want to be able to change the resolution of your date data type in the future, without breaking any of your queries. If you use an inclusive end date, this will not be possible. There are many posts online talking about how to properly query for dates in SQL.

Something like this:

CREATE FUNCTION dbo.TransCurrent()
RETURNS TABLE
AS
RETURN (
   SELECT *
   FROM dbo.Trans
   WHERE
      EffectiveDate <= GetDate()
      AND GetDate() < ExpirationDate --make clustered index have this first!
);

Do NOT confuse this with a multi-statement table-valued function. That will NOT perform well. The inline function here will perform well because it can be inlined into the calling query: the engine takes the logical intent of the function and dispenses with the function call entirely. Using any other kind of function will defeat this, and your performance will go to pot as your table grows in size.
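
Calling code can then treat the function like a table. The query below is a hypothetical usage sketch that assumes dbo.Trans has gained the EffectiveDate and ExpirationDate columns described above.

```sql
-- Hypothetical usage: open credit rows for one customer, current versions only.
SELECT C.TrID, C.Buyin, C.CreditPaid
FROM dbo.TransCurrent() C
WHERE C.CustID = 132
   AND C.Type = 'Credit'
   AND C.Buyin - Coalesce(C.CreditPaid, 0) > 0;
```

Because the function is inline, the date-range predicates inside it and the filters in this query can be optimized together as a single statement.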
