DB-Fiddle

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    country VARCHAR(255),
    sales_date DATE,
    sales_volume DECIMAL,
    fix_costs DECIMAL
);

INSERT INTO sales
(country, sales_date, sales_volume, fix_costs
)
VALUES 

('DE', '2020-01-03', '500', '2000'),
('FR', '2020-01-03', '350', '2000'),
('None', '2020-01-31', '0', '2000'),

('DE', '2020-02-15', '0', '5000'),
('FR', '2020-02-15', '0', '5000'),
('None', '2020-02-29', '0', '5000'),

('DE', '2020-03-27', '180', '4000'),
('FR', '2020-03-27', '970', '4000'),
('None', '2020-03-31', '0', '4000');

Expected Result:

sales_date   |   country    |   sales_volume   |     fix_costs
-------------|--------------|------------------|------------------------------------------
2020-01-03   |     DE       |       500        |     37.95  (= 2000/31 = 64.5 x 0.59)
2020-01-03   |     FR       |       350        |     26.57  (= 2000/31 = 64.5 x 0.41)
-------------|--------------|------------------|------------------------------------------
2020-02-15   |     DE       |         0        |     86.21  (= 5000/29 = 172.4 x 0.50)
2020-02-15   |     FR       |         0        |     86.21  (= 5000/29 = 172.4 x 0.50)
-------------|--------------|------------------|------------------------------------------    
2020-03-27   |     DE       |       180        |     20.20  (= 4000/31 = 129.0 x 0.16) 
2020-03-27   |     FR       |       970        |    108.84  (= 4000/31 = 129.0 x 0.84)   
-------------|--------------|------------------|-------------------------------------------

The column fix_costs in the expected result is calculated as follows:

Step 1) Get the daily rate of the fix_costs per month, i.e. the monthly fix_costs divided by the number of days in that month. (2000/31 = 64.5; 5000/29 = 172.4; 4000/31 = 129.0)

Step 2) Split the daily rate between the countries DE and FR based on their share of the sales_volume on that date. (500/850 = 0.59; 350/850 = 0.41; 180/1150 = 0.16; 970/1150 = 0.84)

Step 3) If the total sales_volume on a date is 0, split the daily rate 50/50 between DE and FR, as you can see for 2020-02-15.
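
As a sanity check, the arithmetic of the three steps can be reproduced with plain literals. This is only a sketch with the values hard-coded from the sample data above, not the query I am looking for; the column aliases are made up for illustration. It should give the same numbers as the fix_costs column in the expected result.

```sql
-- Values are hard-coded from the sample rows; this only checks the arithmetic.
SELECT round(2000.0 / 31 * 500 / 850, 2)  AS de_jan,  -- daily rate x DE share
       round(2000.0 / 31 * 350 / 850, 2)  AS fr_jan,  -- daily rate x FR share
       round(5000.0 / 29 * 0.5, 2)        AS de_feb,  -- no sales: 50/50 split
       round(4000.0 / 31 * 180 / 1150, 2) AS de_mar,
       round(4000.0 / 31 * 970 / 1150, 2) AS fr_mar;
```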


In MariaDB I was able to do this with the query below:

SELECT
s.sales_date, 
s.country,
s.sales_volume,

       (CASE WHEN SUM(sales_volume) OVER (PARTITION BY sales_date) > 0
             THEN ((s.fix_costs/ DAY(LAST_DAY(sales_date))) *
                   sales_volume / NULLIF(SUM(sales_volume) OVER (PARTITION BY sales_date), 0)
                  )
             ELSE (s.fix_costs / DAY(LAST_DAY(sales_date))) * 1 / SUM(country <> 'None') OVER (PARTITION by sales_date)
        END) AS imputed_fix_costs
        
FROM sales s
WHERE country <> 'None'
GROUP BY 1,2,3
ORDER BY 1;

However, in PostgreSQL I get an error on DAY(LAST_DAY(sales_date)).
I tried to replace this part with (date_part('DAY', ((date_trunc('MONTH', s.sales_date) + INTERVAL '1 MONTH - 1 DAY')::date)))
However, this causes another error.

How do I need to modify the query to get the expected result?

Michi

2 Answers

The PostgreSQL equivalent of DAY(LAST_DAY(sales_date)) would be:

extract(day from (date_trunc('month', sales_date + interval '1 month') - interval '1 day'))

The expression SUM(country <> 'None') also needs to be rewritten, because PostgreSQL has no SUM over a boolean:

SUM(case when country <> 'None' then 1 else 0 end)
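
For comparison, here is a small sketch of three ways to write that count in PostgreSQL, run against the sales table from the question. The column aliases are made up for illustration; all three columns should show the same number per sales_date.

```sql
-- Each expression counts the rows per sales_date whose country is not 'None'.
SELECT
    sales_date,
    country,
    SUM(CASE WHEN country <> 'None' THEN 1 ELSE 0 END)
        OVER (PARTITION BY sales_date) AS via_case,
    COUNT(*) FILTER (WHERE country <> 'None')
        OVER (PARTITION BY sales_date) AS via_filter,
    SUM((country <> 'None')::int)
        OVER (PARTITION BY sales_date) AS via_cast
FROM sales
ORDER BY sales_date, country;
```

The CASE form is the most portable; FILTER and the boolean-to-integer cast are shorter but PostgreSQL-flavoured.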

It might be a good idea to define this compatibility function:

create function last_day(d date) returns date as
$$
  -- date + interval yields a timestamp, so cast the result back to date
  select (date_trunc('month', d + interval '1 month') - interval '1 day')::date;
$$ language sql immutable;

Then the first expression becomes simply

extract(day from last_day(sales_date))
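
Putting both fixes together, the whole query might look roughly like the sketch below. It assumes the sales table from the question and inlines the date expression so it does not depend on the helper function. Two things here are my own choices rather than part of the original query: the GROUP BY 1,2,3 is dropped, since PostgreSQL rejects a reference to fix_costs that is neither grouped nor aggregated, and the result is rounded to two decimals to match the expected output.

```sql
-- Sketch only: assumes the sales table and sample rows from the question.
SELECT
    s.sales_date,
    s.country,
    s.sales_volume,
    ROUND(
        -- daily rate: monthly fix_costs divided by the number of days in the month
        s.fix_costs
        / EXTRACT(day FROM date_trunc('month', s.sales_date + interval '1 month')
                           - interval '1 day')::numeric
        -- share of that day's volume, or an equal split when nothing was sold
        * CASE
              WHEN SUM(s.sales_volume) OVER w > 0
                  THEN s.sales_volume / SUM(s.sales_volume) OVER w
              ELSE 1.0 / COUNT(*) OVER w
          END,
        2
    ) AS imputed_fix_costs
FROM sales s
WHERE s.country <> 'None'
WINDOW w AS (PARTITION BY s.sales_date)
ORDER BY s.sales_date, s.country;
```

Because the 'None' rows are already removed in the WHERE clause, COUNT(*) over the window is enough for the equal split.
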
Stefanov.sm

I would create a function to return the last day (number) for a given date - which is actually the "length" of the month.

create function month_length(p_input date) 
  returns integer
as
$$
  -- extract() does not return an integer, so cast to match the declared return type
  select extract(day from (date_trunc('month', p_input) + interval '1 month - 1 day'))::int;
$$
language sql 
immutable;
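
A quick way to check the function, assuming it has been created as above, is to call it on a few sample dates. The dates and column aliases here are picked only for illustration; the leap-year February should come back as 29 and the non-leap one as 28.

```sql
-- Assumes month_length() from above already exists in the current schema.
select d as sample_date,
       month_length(d) as days_in_month
from (values (date '2020-01-03'),
             (date '2020-02-15'),   -- leap year
             (date '2021-02-15'))   -- non-leap year
     as v(d);
```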

Then the query can be written as:

select sales_date, country, 
       sum(sales_volume) as sales_volume,
       sum(fix_costs_per_day * cost_factor) as fix_costs
from (       
  select id, country, sales_date, sales_volume, fix_costs, 
         fix_costs / month_length(sales_date) as fix_costs_per_day,
         case 
            when sum(sales_volume) over (partition by sales_date) > 0 
              then sales_volume::numeric / sum(sales_volume) over (partition by sales_date)
            -- nothing sold on that date: split evenly across the remaining countries
            else 1::numeric / count(*) over (partition by sales_date)
         end as cost_factor
  from sales
  where country <> 'None'
) t
group by sales_date, country
order by sales_date, country;