// bdldfp_decimal.h -*-C++-*-
#ifndef INCLUDED_BDLDFP_DECIMAL
#define INCLUDED_BDLDFP_DECIMAL
#include <bsls_ident.h>
BSLS_IDENT("$Id$")
//@PURPOSE: Provide IEEE-754 decimal floating-point types.
//
//@CLASSES:
// bdldfp::Decimal32: 32bit IEEE-754 decimal floating-point type
// bdldfp::Decimal64: 64bit IEEE-754 decimal floating-point type
// bdldfp::Decimal128: 128bit IEEE-754 decimal floating-point type
// bdldfp::DecimalNumGet: Stream Input Facet
// bdldfp::DecimalNumPut: Stream Output Facet
//
//@MACROS:
// BDLDFP_DECIMAL_DF: Portable Decimal32 literal macro
// BDLDFP_DECIMAL_DD: Portable Decimal64 literal macro
// BDLDFP_DECIMAL_DL: Portable Decimal128 literal macro
//
//@SEE_ALSO: bdldfp_decimalutil, bdldfp_decimalconvertutil,
// bdldfp_decimalplatform
//
//@DESCRIPTION: This component provides classes that implement decimal
// floating-point types that conform in layout, encoding and operations to the
// IEEE-754 2008 standard. This component also provides two facets to support
// standard C++ streaming operators as specified by ISO/IEC TR-24733:2009.
// These classes are 'bdldfp::Decimal32' for 32-bit decimal floating-point
// numbers, 'bdldfp::Decimal64' for 64-bit decimal floating-point numbers, and
// 'bdldfp::Decimal128' for 128-bit decimal floating-point numbers.
//
// Decimal encoded floating-point numbers are important where exact
// representation of decimal fractions is required, such as in financial
// transactions. Binary encoded floating-point numbers are generally optimal
// for complex computation but cannot exactly represent commonly encountered
// numbers such as 0.1, 0.2, and 0.99.
//
// NOTE: Interconversion between binary and decimal floating-point values is
// fraught with misunderstanding and must be done carefully and with intent,
// taking into account the provenance of the data. See the discussion on
// conversion below and in the 'bdldfp_decimalconvertutil' component.
//
// The BDE decimal floating-point system has been designed from the ground up
// to be portable and support writing portable decimal floating-point user
// code, even for systems that do not have compiler or native library support
// for it; while taking advantage of native support (such as ISO/IEC TR
// 24732 - C99 decimal TR) when available.
//
// 'bdldfp::DecimalNumGet' and 'bdldfp::DecimalNumPut' are IO stream facets.
//
///Floating-Point Primer
///---------------------
// There are several ways to represent numbers when using digital computers.
// The simplest would be an integer format; however, such a format severely
// limits the range of numbers that can be represented; and it cannot represent
// real (non-integer) numbers directly at all. Integers might be used to
// represent real numbers of limited precision by treating them as a multiple
// of the real value being represented; these are often known as fixed-point
// numbers. However general computations require higher precision and a larger
// range than integer and fixed point types are able to efficiently provide.
// Floating-point numbers provide what integers cannot. They are able to
// represent a large range of real values (although not precisely) while using
// a fixed (and reasonable) amount of storage.
//
// Floating-point numbers are constructed from a set of significant digits of a
// radix on a sliding scale, where their position is determined by an exponent
// over the same radix. For example let's see some 32bit decimal (radix 10)
// floating-point numbers that have maximum 7 significant digits (significand):
//..
//   Significand | Exponent | Value        |
//  -------------+----------+--------------+  In the Value column you may
//       1234567 |        0 |    1234567.0 |  observe how the decimal point
//       1234567 |        1 |   12345670.0 |  is "floating" about the digits
//       1234567 |        2 |  123456700.0 |  of the significand.
//       1234567 |       -1 |     123456.7 |
//       1234567 |       -2 |     12345.67 |
//..
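// A minimal sketch of how the values in the table above can be derived,
// assuming a C++11 standard library. The function 'toDecimalString' is a
// hypothetical helper written for this illustration and is not part of this
// component; it places the decimal point using only integer and string
// operations, so no binary rounding is involved (and, unlike the table, it
// does not append a trailing ".0" to integral values):
//..
//  #include <cstddef>
//  #include <cstdlib>
//  #include <string>
//
//  // Render 'significand * 10^exponent' as a plain decimal string.
//  std::string toDecimalString(long long significand, int exponent)
//  {
//      const bool  negative = significand < 0;
//      std::string digits   = std::to_string(std::llabs(significand));
//
//      if (exponent >= 0) {
//          // A non-negative exponent appends zeros (point moves right).
//          digits += std::string(static_cast<std::size_t>(exponent), '0');
//      }
//      else {
//          const std::size_t shift = static_cast<std::size_t>(-exponent);
//          if (digits.size() <= shift) {
//              // Pad on the left so that one digit precedes the point.
//              digits = std::string(shift - digits.size() + 1, '0') + digits;
//          }
//          digits.insert(digits.size() - shift, ".");
//      }
//      return (negative ? "-" : "") + digits;
//  }
//..
// With this helper, 'toDecimalString(1234567, -2)' yields "12345.67" and
// 'toDecimalString(1234567, 2)' yields "123456700".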
// Floating-point numbers are standardized by IEEE-754 2008, in two major
// flavors: binary and decimal. Binary floating-point numbers are supported by
// most computer systems in the forms of the 'float', 'double' and
// 'long double' fundamental data types. While they are not required to be
// binary, that is almost always the choice on modern binary computer
// architectures.
//
///Floating-Point Peculiarities
/// - - - - - - - - - - - - - -
// Floating-point approximation of real numbers creates a deliberate illusion.
// While it looks like we are working with real numbers, floating-point
// encodings are not able to represent real numbers precisely since they have a
// restricted number of digits in the significand. In fact, a 64 bit
// floating-point type can represent fewer distinct values than a 64 bit binary
// integer. Yet, because floating-point encodings can represent numbers over a
// much larger range, including extremely small (fractional) numbers, they are
// useful in practice.
//
// Floating-point peculiarities may be split into three categories: those that
// are due to the (binary) radix/base, those that are inherent properties of
// any floating-point representation and finally those that are introduced by
// the IEEE-754 2008 standard. Decimal floating-point addresses the first set
// of surprises only; so users still need to be aware of the rest.
//
//: 1 Floating-point types cannot exactly represent every number in their
//: range. The consequences are surprising and unexpected for the newcomer.
//: For example: when using binary floating-point numbers, the following
//: expression is typically *false*: '0.1 + 0.2 == 0.3'. The problem is not
//: limited to binary floating-point. Decimal floating-point cannot
//: represent the value of one third exactly.
//:
//: 2 Unlike with real numbers, the order of operations on floating-point
//: numbers is significant, due to accumulation of round off errors.
//: Therefore floating-point addition is not associative: regrouping or
//: reordering the terms of a sum can change the result. E.g.,
//: 2e-30 + 1e30 - 1e-30 - 1e30 will typically produce 0 (unless your
//: significand can hold 60 decimal digits). Alternatively,
//: 1e30 - 1e30 + 2e-30 - 1e-30 will typically produce 1e-30.
//:
//: 3 IEEE floating-point types can have special values: negative zero,
//: negative and positive infinity; and they can be NaN (Not a Number, in two
//: variants: quiet or signaling). A NaN (any variant) is never equal to
//: anything else - including NaN or itself!
//:
//: 4 In IEEE floating-point there are at least two representations of 0, the
//: positive zero and negative zero. Consequently the unary '-' operator
//: changes the sign of the value 0, leading to surprising results: if 'f'
//: is positive zero then '0 - f' and '-f' will not result in the same
//: value, because '0 - f' will be '+0.0' while '-f' will be '-0.0'.
//:
//: 5 Most IEEE floating-point operations (like arithmetic) have implicit input
//: parameters and output parameters (that do not show up in function
//: signatures). The implicit input parameters are called *attributes* by
//: IEEE while the outputs are called status flags. The C/C++ programming
//: language defines a so-called floating-point environment that contains
//: those attributes and flags ('<fenv.h>' C and '<cfenv>' C++ headers). To
//: learn more about the floating point environment read the subsection of
//: the same title, but first make sure you read the next point as well.
//:
//: 6 IEEE floating-point overloads some very common programming language
//: terms: *exception*, *signal* and *handler* with IEEE floating-point
//: specific meanings that are not to be confused with C or C++ or Posix
//: terms of the same spelling. Floating-point exceptions are events that
//: occur when a floating-point operation on the specified operands is
//: unable to produce a perfect outcome; such as when the result of an
//: operation is inexact. When a floating-point exception occurs - and
//: reporting it is requested by a so-called trap attribute - the
//: implementation signals the user(*) by invoking a default or a
//: user-defined handler. None of the words *exception*, *signal*, and
//: *handler* used above has anything to do with C++ exceptions, Posix
//: signals and the handlers of those. (To complicate matters more, C and
//: Posix have decided to implement IEEE floating-point exception reporting
//: as C/Posix signals - and therefore rendered them mostly useless.)
//:
//: 7 While a 32bit integer is a quite useful type for (integer) calculations,
//: a 32bit floating-point type has such low accuracy (its significand is so
//: short) that it is all but useless for calculation. Such types are called
//: "interchange formats" by the IEEE standard and should not be used for
//: calculations. (Except in special circumstances and by floating-point
//: experts. Even a 16 bit binary floating-point type can be useful for an
//: expert in special circumstances, for example in graphics acceleration
//: hardware.)
//
// Notes:
// (*) IEEE Floating-point user is any person, hardware or software that uses
// the IEEE floating-point implementation.
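//
// The first four points can be observed with the built-in binary 'double'
// type. The following is a small sketch using only the standard library (the
// function names are made up for this illustration), assuming an IEEE-754
// binary 'double' and a compiler that is not run in a "fast math" mode:
//..
//  #include <cmath>
//  #include <limits>
//
//  // Point 1: typically 'false', since none of the three values is exact.
//  bool tenthPlusFifthIsThreeTenths()
//  {
//      volatile double a = 0.1, b = 0.2;   // force run-time evaluation
//      return a + b == 0.3;
//  }
//
//  // Point 2: the same four terms, summed in two different orders.
//  double sumInterleaved()     { return 2e-30 + 1e30 - 1e-30 - 1e30; }
//  double sumLargeTermsFirst() { return 1e30 - 1e30 + 2e-30 - 1e-30; }
//
//  // Point 3: a NaN does not compare equal even to itself.
//  bool nanEqualsItself()
//  {
//      volatile double nan = std::numeric_limits<double>::quiet_NaN();
//      return nan == nan;
//  }
//
//  // Point 4: for positive zero 'f', '-f' is negative zero while '0 - f'
//  // is positive zero.
//  bool negationIsNegative(double f)    { return std::signbit(-f); }
//  bool subtractionIsNegative(double f) { return std::signbit(0 - f); }
//..
// Under those assumptions 'sumInterleaved()' typically returns 0 while
// 'sumLargeTermsFirst()' returns a small positive value, and
// 'negationIsNegative(0.0)' is 'true' while 'subtractionIsNegative(0.0)' is
// 'false'.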
//
///Floating-Point Environment
/// - - - - - - - - - - - - -
// NOTE: We currently do not give access to the user to the floating-point
// environment used by our decimal system, so description of it here is
// preliminary and generic. Note that since compilers and the C library
// already provide a (possibly binary floating-point only) environment and we
// cannot change that, our decimal floating-point environment implementation
// cannot conform to the C and C++ TRs (because those require extending the
// existing standard C library functions).
//
// The floating-point environment provides implicit input and output parameters
// to floating-point operations (that are defined to use them). IEEE defined
// those parameters in principle, but how they are provided is left to the
// implementors of the programming languages.
//
// C (and consequently C++) decided to provide a so-called floating-point
// environment that has "thread storage duration", meaning that each thread of
// a multi-threaded program will have its own distinct floating-point
// environment.
//
// The C/C++ floating-point environment consists of 3 major parts: the rounding
// mode, the traps and the status flags.
//
///Rounding Direction in The Environment
///- - - - - - - - - - - - -
// A floating-point *rounding direction* determines how the significand of a
// higher (or infinite) precision number gets rounded to fit into the limited
// number of significant digits (significand) of the floating-point
// representation that needs to store it as a result of an operation. Note
// that the rounding is done in the radix of the representation, so binary
// floating-point will do binary rounding while decimal floating-point will do
// decimal rounding - and not all rounding modes are useful with all radixes.
// An example of a generally applicable rounding mode would be 'FE_TOWARDZERO'
// (round towards zero).
//
// Most floating point operations in C and C++ do not take a rounding direction
// parameter (and the ones that are implemented as operators simply could not).
// When such operations (that do not have an explicit rounding direction
// parameter) need to do rounding, they use the rounding direction set in the
// floating-point environment (of their thread of execution).
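//
// As an illustration with the standard binary floating-point environment (not
// the decimal environment of this component), the following sketch rounds a
// 'double' to an integral value under a chosen rounding direction. The helper
// 'roundWithDirection' is hypothetical; the 'volatile' qualifiers are a
// pragmatic assumption to keep the compiler from evaluating the operation at
// compile time or moving it across the calls that change the direction:
//..
//  #include <cfenv>
//  #include <cmath>
//
//  // Round 'value' using 'direction' (e.g., 'FE_TOWARDZERO'), then restore
//  // the rounding direction that was in effect on entry.
//  double roundWithDirection(double value, int direction)
//  {
//      const int previous = std::fegetround();
//      std::fesetround(direction);
//      volatile double input  = value;
//      volatile double result = std::nearbyint(input);
//      std::fesetround(previous);
//      return result;
//  }
//..
// For example, 'roundWithDirection(2.7, FE_TOWARDZERO)' is expected to yield
// 2.0, whereas 'roundWithDirection(2.1, FE_UPWARD)' is expected to yield 3.0.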
//
///Status Flags
/// - - - -
// Floating point operations in C and C++ do not take a status flag output
// parameter. They report important events (such as underflow, overflow or
// an inexact (rounded) result) by setting the appropriate status flag in the
// floating-point environment (of their thread of execution). (Note that this
// is very similar to how flags work in CPUs, and that is not a coincidence.)
// The flags work much like individual, boolean 'errno' values. Operations may
// set them to true. Users may examine them (when interested) and also reset
// them (set them to 0) before an operation.
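//
// The clear-then-test pattern can be sketched with the standard '<cfenv>'
// facilities for binary floating-point (the helper 'divisionIsInexact' is a
// name invented for this illustration, and the 'volatile' qualifiers are
// assumed to be enough to force the division to happen at run time):
//..
//  #include <cfenv>
//
//  // Return 'true' if dividing 'numerator' by 'denominator' raises the
//  // 'FE_INEXACT' status flag, i.e., if the quotient had to be rounded.
//  // Note that all status flags are cleared as a side effect.
//  bool divisionIsInexact(double numerator, double denominator)
//  {
//      std::feclearexcept(FE_ALL_EXCEPT);    // reset the flags first
//      volatile double n        = numerator;
//      volatile double d        = denominator;
//      volatile double quotient = n / d;     // may set status flags
//      (void) quotient;
//      return 0 != std::fetestexcept(FE_INEXACT);
//  }
//..
// On an IEEE-754 platform 'divisionIsInexact(1.0, 3.0)' is expected to be
// 'true' and 'divisionIsInexact(1.0, 4.0)' is expected to be 'false'.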
//
///Floating-Point Traps
/// - - - - - - -
// IEEE says that certain floating-point events are floating-point exceptions
// and they result in invoking a handler. It may be a default handler (set a
// status flag and continue) or a user defined handler. Floating point traps
// are a C invention to enable "sort-of handlers" for floating point
// exceptions, but unfortunately they all go to the same handler: the 'SIGFPE'
// handler. To add insult to injury, setting what traps are active (what will
// cause a 'SIGFPE') is not standardized. So floating-point exceptions and
// handlers are considered pretty much useless in C. (All is not lost, since
// we do have the status flags. An application that wants to know about
// floating-point events can clear the flags prior to an operation and check
// their values afterwards.)
//
///Error Reporting
///- - - - - - - -
// The 'bdldfp_decimalutil' utility component provides a set of decimal math
// functions that parallel those provided for binary floating point in the C++
// standard math library. Errors during computation of these functions (e.g.,
// domain errors) will be reported through the setting of 'errno' as described
// in the "Status Flags" section above. (Note that this method of reporting
// errors is atypical for BDE-provided interfaces, but matches the style used
// by the standard functions.)
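//
// The standard-library style that these functions parallel can be sketched
// with the binary 'std::log' (this is not a call into 'bdldfp_decimalutil';
// 'logReportsDomainError' is a name invented for the illustration). Whether
// the standard functions set 'errno' at all is implementation-defined and is
// indicated by 'math_errhandling & MATH_ERRNO':
//..
//  #include <cerrno>
//  #include <cmath>
//
//  // Return 'true' if computing 'std::log(value)' sets 'errno' to 'EDOM'.
//  bool logReportsDomainError(double value)
//  {
//      errno = 0;                            // clear any stale error first
//      volatile double input  = value;
//      volatile double result = std::log(input);
//      (void) result;
//      return EDOM == errno;
//  }
//..
// As with the status flags, the caller clears the indicator before the
// operation and examines it afterwards.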
//
///Floating-Point Terminology
/// - - - - - - - - - - - - -
// A floating-point representation of a number is defined as follows:
// 'sign * significand * BASE^exponent', where sign is -1 or +1, significand is
// an integer, BASE is a positive integer (but usually 2 or 10) and exponent is
// a negative or positive integer. Concrete examples of (decimal) numbers in
// the so-called scientific notation are: 123.4567 is 1.234567e2, while
// -0.000000000000000000000000000000000000001234567 would be -1.234567e-41.
//
//: "base":
//: the number base of the scaling used by the exponent; and by the
//: significand
//:
//: "bias":
//: the number added to the exponent before it is stored in memory; 101, 398
//: and 6176 for the 32, 64 and 128 bit types respectively.
//:
//: "exponent":
//: the scaling applied to the significand is calculated by raising the base
//: to the exponent (which may be also negative)
//:
//: "quantum":
//: (IEEE-754) the value of one unit at the last significant digit
//: position; in other words the smallest difference that can be
//: represented by a floating-point number without changing its exponent.
//:
//: "mantissa":
//: the old name for the significand
//:
//: "radix":
//: another name for base
//:
//: "sign":
//: +1 or -1, determines if the number is positive or negative. It is
//: normally represented by a single sign bit.
//:
//: "significand":
//: the significant digits of the floating-point number; the value of the
//: number is: 'sign * significand * base^exponent'
//:
//: "precision":
//: the significant digits of the floating-point type in its base
//:
//: "decimal precision":
//: the maximum significant decimal digits of the floating-point type
//:
//: "range":
//: the smallest and largest number the type can represent. Note that for
//: floating-point types there are at least *two* interpretations of
//: minimum. It may be the largest negative number *or* the smallest number
//: (in absolute value) that can be represented.
//:
//: "normalized number":
//: '1 <= significand < base'
//:
//: "normalization":
//: finding the exponent such that '1 <= significand < base'
//:
//: "denormal number":
//: 'significand < 1'
//:
//: "densely packed decimal":
//: one of the two IEEE significand encoding schemes
//:
//: "binary integer significand":
//: one of the two IEEE significand encoding schemes
//:
//: "cohorts":
//: equal numbers encoded using different exponents (to signify accuracy)
//
///Decimal Floating-Point
///----------------------
// Binary floating-point formats give best accuracy, they are the fastest (on
// binary computers), and were carefully designed by IEEE to minimize rounding
// errors (errors due to the inherent imprecision of floating-point types)
// during a lengthy calculation. This makes them the best solution for any
// serious scientific computation. However, they have a fatal flaw when it
// comes to numbers and calculations that involve humans. Humans think in base
// 10 - decimal. And as the example has shown earlier, binary floating-point
// formats are unable to precisely represent very common decimal real numbers;
// with binary floating-point '0.1 + 0.2 != 0.3'. (Why? Because none of the
// three numbers in that expression have an exact binary floating-point
// representation.)
//
// Financial calculations are governed by laws and expectations that are based
// on decimal (10 based) thinking. Due to the inherent limitations of the
// binary floating-point format, doing such decimal based calculations and
// algorithms using binary floating-point numbers is so involved and hard that
// it is considered not feasible. The IEEE-754 committee has recognized
// the issue and added specifications for 3 decimal floating-point types into
// their 2008 standard: the 32, 64 and 128 bits decimal floating-point formats.
//
// Floating-point types are carefully designed trade-offs between saving space
// (in memory), saving CPU cycles (for calculations) and still providing
// useful accuracy for computations. Decimal floating-point types represent
// further compromises (compared to binary floating-point types) in being able
// to represent fewer numbers (than their binary counterparts) and being
// slower, but providing exact representations for the numbers humans care
// about.
//
// In decimal floating-point world '0.1 + 0.2 == 0.3', as humans expect;
// because each of those 3 numbers can be represented *exactly* in a decimal
// floating-point format.
//
///*WARNING*: Conversions from 'float' and 'double'
/// - - - - - - - - - - - - - - - - - - - - - - - -
// Clients should *be* *careful* when using the conversions from 'float' and
// 'double' provided by this component. In situations where a 'float' or
// 'double' was originally obtained from a decimal floating point
// representation (e.g., a 'bdldfp::Decimal', or a string, like "4.1"), the
// conversions in 'bdldfp_decimalconvertutil' will provide the correct
// conversion back to a decimal floating point value. The conversions in this
// component provide the closest decimal floating point value to the supplied
// binary floating point representation, which may replicate imprecisions
// required to initially approximate the value in a binary representation.
// The conversions in this component are typically useful when converting
// binary floating point values that have undergone mathematical operations
// that require rounding (so they are already inexact approximations).
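//
// The imprecision in question can be made visible without this component.
// The sketch below (standard library only; 'formatDigits' is a hypothetical
// helper) prints a 'double' with a chosen number of significant decimal
// digits, assuming an IEEE-754 binary 'double':
//..
//  #include <cstdio>
//  #include <string>
//
//  // Format 'value' using 'digits' significant decimal digits.
//  std::string formatDigits(double value, int digits)
//  {
//      char buffer[64];
//      std::snprintf(buffer, sizeof buffer, "%.*g", digits, value);
//      return buffer;
//  }
//..
// 'formatDigits(4.1, 17)' yields "4.0999999999999996", exposing the binary
// approximation that a closest-value conversion to a sufficiently wide
// decimal type would reproduce, while 'formatDigits(4.1, 15)' yields "4.1",
// the value the text originally denoted.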
//
///Cohorts
///- - - -
// In the binary floating-point world the formats are optimized for the highest
// precision, range and speed. They are stored normalized and therefore store
// no information about their accuracy. In finances, the area that decimal
// floating-point types target, accuracy of a number is usually very important.
// We may have a number that is 1, but we know it may be 1.001 or 1.002 etc.
// And we may have another number 1, which we know to be accurate to 6
// significant digits. We would display the former number as '1.00' and the
// latter number as '1.00000'. The decimal floating-point types are able to
// store both numbers *and* their precision using so called cohorts. The
// '1.00' will be stored as '100e-2' while '1.00000' will be stored as
// '100000e-5'.
//
// Cohorts compare equal, and mostly behave the same way in calculation except
// when it comes to the accuracy of the result. If I have a number that is
// accurate to 5 digits only, it would be a mistake to try to expect more than
// 5 digits accuracy from a calculation involving it. The IEEE-754 rules of
// cohorts (in calculations) ensure that results will be a cohort that
// indicates the proper expected accuracy.
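//
// The idea can be sketched with a toy model (this is *not* how the types of
// this component are implemented; 'ToyDecimal', 'sameValue' and 'multiply'
// are hypothetical names, and overflow is ignored). Each value is a pair of
// an integer significand and a decimal exponent, so two members of a cohort
// differ in representation but not in value:
//..
//  struct ToyDecimal {
//      long long significand;
//      int       exponent;      // the value is 'significand * 10^exponent'
//  };
//
//  // Return 'true' if 'lhs' and 'rhs' denote the same number.
//  bool sameValue(ToyDecimal lhs, ToyDecimal rhs)
//  {
//      while (lhs.exponent > rhs.exponent) {
//          lhs.significand *= 10;
//          --lhs.exponent;
//      }
//      while (rhs.exponent > lhs.exponent) {
//          rhs.significand *= 10;
//          --rhs.exponent;
//      }
//      return lhs.significand == rhs.significand;
//  }
//
//  // Multiply exactly: the significands multiply and the exponents add.
//  ToyDecimal multiply(ToyDecimal lhs, ToyDecimal rhs)
//  {
//      ToyDecimal result = { lhs.significand * rhs.significand,
//                            lhs.exponent + rhs.exponent };
//      return result;
//  }
//..
// Here '{ 100, -2 }' ('1.00') and '{ 100000, -5 }' ('1.00000') satisfy
// 'sameValue', yet their product is '{ 10000000, -7 }', which keeps all seven
// fractional digits of the two operands.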
//
///Standards Conformance
///---------------------
// The component has also been designed to resemble the C++ Decimal
// Floating-Point Technical Report ISO/IEC TR-24733 of 2009 and its C++11
// updates of ISO/IEC JTC1 SC22 WG21 N3407=12-0097 of 2012 as much as it is
// possible with C++03 compilers and environments that do not provide decimal
// floating-point support in any form.
//
// At the time of writing there is just one standard about decimal
// floating-point, the IEEE-754 2008 standard, and the content of this
// component conforms to it. (The component does not fully implement all
// required IEEE-754 functionality because, due to our architectural design
// guidelines, some of it must go into a separate so-called utility
// component.)
//
// The component uses the ISO/IEC TR 24732 - the C Decimal Floating-Point
// TR - in its implementation where it is available.
//
// The component closely resembles ISO/IEC TR 24733 - the C++ Decimal
// Floating-Point TR - but does not fully conform to it for several reasons.
// The major reasons are: it is well known that TR 24733 has to change before
// it is included into the C++ standard; the TR would require us to change
// system header files we do not have access to.
//
// In the following subsections the differences to the C++ technical report are
// explained in detail, including a short rationale.
//
///No Namespace Level Named Functions
/// - - - - - - - - - - - - - - - - -
// BDE design guidelines do not allow namespace level functions other than
// operators and aspects. According to BDE design principles all such
// functions are placed into a utility component.
//
///All Converting Constructors from Integer Types are Explicit
///- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// This change is necessary to disable the use of comparison operators without
// explicit casting. See No Heterogeneous Comparisons Without Casting.
//
///No Heterogeneous Comparisons Without Casting
/// - - - - - - - - - - - - - - - - - - - - - -
// The C and C++ Decimal TRs refer to IEEE-754 for specifications of the
// heterogeneous comparison operators (comparing decimal floating-point types
// to binary floating-point types and integer types); however IEEE-754 does
// *not* specify such operations - leaving them unspecified. To make matters
// worse, there are two possible ways to implement those operators (convert the
// decimal to the other type, or convert the other type to decimal first) and
// depending on which one is chosen, the result of the operator will be
// different. Also, the C committee is considering the removal of those
// operators. We have removed them until we know how to implement them.
// Comparing decimal types to those other types is still possible, it just
// requires explicit casting/conversion from the user code.
//
///Arithmetic And Computing Support For 'Decimal32'
/// - - - - - - - - - - - - - - - - - - - - - - - - -
// IEEE-754 designates the 32 bit floating-point types "interchange formats"
// and does not require or recommend arithmetic or computing support of any
// kind for them. The C (and consequently the C++) TR goes against the IEEE
// design and requires '_Decimal32' (and 'std::decimal32') to provide computing
// support, however, in a twist, allows it to be performed using one of the
// larger types (64 or 128 bits). The rationale from the C committee is that
// small embedded systems may need to do their calculations using the small
// type (so they have made it mandatory for everyone). To conform to the
// requirement we provide arithmetic and computing support for the 'Decimal32'
// type but users need to be aware of the drawbacks of calculations using the
// small type. Industry experience with the 'float' C type (32bit
// floating-point type, usually binary) has shown that enabling computing
// using small floating-point types is a mistake that causes novice
// programmers to write calculations that are very slow and inaccurate.
//
// We recommend what IEEE recommends: convert your 32 bit types on receipt to a
// type with higher precision (usually 64 bit will suffice), do your
// calculations using that larger type, and convert it back to 32 bit type only
// if your output interchange format requires it.
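//
// The cost of computing in a small type can be sketched with the binary
// 'float' type mentioned above (standard library only; 'sumOfTenths' is a
// name invented here). The same 32-bit input is accumulated once in a 32-bit
// type and once after widening to a 64-bit type:
//..
//  // Add the 'float' value nearest to 0.1 to itself 'count' times, keeping
//  // the running total in the parameterized 'ACCUMULATOR' type.
//  template <class ACCUMULATOR>
//  ACCUMULATOR sumOfTenths(int count)
//  {
//      const float tenth = 0.1f;           // the 32-bit "interchange" value
//      ACCUMULATOR total = 0;
//      for (int i = 0; i < count; ++i) {
//          total += static_cast<ACCUMULATOR>(tenth);   // widen, then add
//      }
//      return total;
//  }
//..
// On typical IEEE-754 platforms 'sumOfTenths<float>(10000000)' drifts away
// from one million by tens of thousands, whereas
// 'sumOfTenths<double>(10000000)' stays within one unit of it.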
//
///Non-Standard Member Functions
///- - - - - - - - - - - - - - -
// Due to BDE rules of design and some implementation needs we have extended
// the C++ TR mandated interface of the decimal floating-point types to include
// support for accessing the underlying data (type) and for parsing literals
// for the portable literal support.
//
// Note that using any of these public member functions will render your code
// non-portable to non-BDE (but standards conforming) implementations.
//
///'Decimal32' Type
///----------------
// A basic format type that supports input, output, relational operators,
// construction from the TR-mandated data types, and arithmetic operations.
// The type has the size of exactly 32 bits. It supports 7 significant decimal
// digits and an exponent range of -95 to 96. The smallest non-zero value that
// can be represented is 1e-101.
//
// Portable 'Decimal32' literals are created using the 'BDLDFP_DECIMAL_DF'
// macro.
//
///'Decimal64' Type
///----------------
// A basic format type that supports input, output, relational operators,
// construction from the TR-mandated data types, and arithmetic operations.
// The type has the size of exactly 64 bits. It supports 16 significant
// decimal digits and an exponent range of -383 to 384. The smallest non-zero
// value that can be represented is 1e-398.
//
// Portable 'Decimal64' literals are created using the 'BDLDFP_DECIMAL_DD'
// macro.
//
///'Decimal128' Type
///-----------------
// A basic format type that supports input, output, relational operators,
// construction from the TR-mandated data types, and arithmetic operations.
// The type has the size of exactly 128 bits. It supports 34 significant
// decimal digits and an exponent range of -6143 to 6144. The smallest
// non-zero value that can be represented is 1e-6176.
//
// Portable 'Decimal128' literals are created using the 'BDLDFP_DECIMAL_DL'
// macro.
//
///Decimal Number Formatting
///-------------------------
// Streaming decimal floating point numbers to an output stream supports
// formatting flags for width, capitalization and justification and flags used
// to output numbers in natural, scientific and fixed notations. When
// scientific or fixed flags are set then the precision manipulator specifies
// how many digits of the decimal number are to be printed; otherwise all
// significant digits of the decimal number are output using natural
// notation.
//
///User-defined literals
///---------------------
// The user-defined literal operators 'operator "" _d32', 'operator "" _d64',
// and 'operator "" _d128' are declared for the 'bdldfp::Decimal32',
// 'bdldfp::Decimal64', and 'bdldfp::Decimal128' types respectively. These
// user-defined literal suffixes can be applied to both numeric and string
// literals (e.g., 1.2_d128, "1.2"_d128 or "inf"_d128) to produce a decimal
// floating-point value of the indicated type by parsing the argument string
// or numeric value:
//..
//  using namespace bdldfp::DecimalLiterals;
//
//  bdldfp::Decimal32 d0 = "1.2"_d32;
//  bdldfp::Decimal32 d1 = 1.2_d32;
//  assert(d0 == d1);
//
//  bdldfp::Decimal64 d2 = "3.45678901234"_d64;
//  bdldfp::Decimal64 d3 = 3.45678901234_d64;
//  assert(d2 == d3);
//
//  bdldfp::Decimal128 inf = "inf"_d128;
//  bdldfp::Decimal128 nan = "nan"_d128;
//..
// The operators providing literals are available in the
// 'BloombergLP::bdldfp::literals::DecimalLiterals' namespace (where 'literals'
// and 'DecimalLiterals' are both inline namespaces). Because of inline
// namespaces, there are several viable options for a using declaration, but
// *we* *recommend* 'using namespace bdldfp::DecimalLiterals', which minimizes
// the scope of the using declaration.
//
// Note that the parsing follows the rules as specified for the 'strtod32',
// 'strtod64' and 'strtod128' functions in section 9.6 of the ISO/IEC TR 24732
// C Decimal Floating-Point Technical Report.
//
// Also note that these operators can be used only if the compiler supports
// the C++11 standard.
//
///Usage
///-----
// In this section, we show the intended usage of this component.
//
///Example 1: Portable Initialization of Non-Integer, Constant Values
/// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// If your compiler does not support the C Decimal TR, it does not support
// decimal floating-point literals, only binary floating-point literals. The
// problem with binary floating-point literals is the same as with binary
// floating-point numbers in general: they cannot represent the decimal numbers
// we care about. To solve this problem there are 3 macros provided by this
// component that can be used to initialize decimal floating-point types with
// non-integer values, precisely. These macros will evaluate to real, C
// language literals where those are supported and to a runtime-parsed solution
// otherwise. The following code demonstrates the use of these macros as well
// as mixed-type arithmetic and comparisons:
//..
// bdldfp::Decimal32 d32( BDLDFP_DECIMAL_DF(0.1));
// bdldfp::Decimal64 d64( BDLDFP_DECIMAL_DD(0.2));
// bdldfp::Decimal128 d128(BDLDFP_DECIMAL_DL(0.3));
//
// assert(d32 + d64 == d128);
// assert(bdldfp::Decimal64(d32) * 10 == bdldfp::Decimal64(1));
// assert(d64 * 10 == bdldfp::Decimal64(2));
// assert(d128 * 10 == bdldfp::Decimal128(3));
//..
//
///Example 2: Precise Calculations with Decimal Values
///- - - - - - - - - - - - - - - - - - - - - - - - - -
// Suppose we need to add two (decimal) numbers and then tell if the result is
// a particular decimal number or not. That can be difficult with binary
// floating-point, but is easy with decimal:
//..
// if (std::numeric_limits<double>::radix == 2) {
//     assert(.1 + .2 != .3);
// }
// assert(BDLDFP_DECIMAL_DD(0.1) + BDLDFP_DECIMAL_DD(0.2)
//     == BDLDFP_DECIMAL_DD(0.3));
//..
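//
///Example 3: Mixed-Precision Compound Assignment
/// - - - - - - - - - - - - - - - - - - - - - - -
// The following sketch is an illustration based on the behavior documented
// for the 'Decimal32' manipulators below, not a statement about the
// implementation; the variable names are hypothetical. Adding a 'Decimal64'
// to a 'Decimal32' is performed with 64-bit precision and the result is then
// rounded back to 32 bits:
//..
// bdldfp::Decimal32 total(BDLDFP_DECIMAL_DF(0.1));
// total += BDLDFP_DECIMAL_DD(0.2);
// assert(total == BDLDFP_DECIMAL_DF(0.3));
//..
// Because 'Decimal32' holds only 7 significant decimal digits, incrementing a
// sufficiently large value may leave it unchanged (assuming the default
// round-to-nearest rounding direction):
//..
// bdldfp::Decimal32 big(100000000);    // 1e8, exactly representable
// bdldfp::Decimal32 before = big;
// ++big;                               // 100000001 needs 9 digits
// assert(big == before);
//..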
#include <bdlscm_version.h>
#include <bdldfp_decimalimputil.h>
#include <bdldfp_decimalstorage.h>
#include <bslh_hash.h>
#include <bslma_default.h>
#include <bslmf_istriviallycopyable.h>
#include <bslmf_nestedtraitdeclaration.h>
#include <bsls_assert.h>
#include <bsls_compilerfeatures.h>
#include <bsls_keyword.h>
#include <bsls_libraryfeatures.h>
#include <bsls_platform.h>
#include <bsl_cstddef.h>
#include <bsl_cstring.h>
#include <bsl_ios.h>
#include <bsl_iosfwd.h>
#include <bsl_iterator.h>
#include <bsl_limits.h>
#include <bsl_locale.h>
#ifndef BDE_DONT_ALLOW_TRANSITIVE_INCLUDES
#include <bslalg_typetraits.h>
#endif // BDE_DONT_ALLOW_TRANSITIVE_INCLUDES
// Portable decimal floating-point literal support
#define BDLDFP_DECIMAL_DF(lit) \
BloombergLP::bdldfp::Decimal32(BDLDFP_DECIMALIMPUTIL_DF(lit))
#define BDLDFP_DECIMAL_DD(lit) \
BloombergLP::bdldfp::Decimal64(BDLDFP_DECIMALIMPUTIL_DD(lit))
#define BDLDFP_DECIMAL_DL(lit) \
BloombergLP::bdldfp::Decimal128(BDLDFP_DECIMALIMPUTIL_DL(lit))
namespace BloombergLP {
namespace bdldfp {
// FORWARD DECLARATIONS
class Decimal_Type32;
class Decimal_Type64;
class Decimal_Type128;
// These are the actual (decimal floating-point) types being implemented.
// They use a different name to cause an error if the official types are
// forward declared: The exact definition of the decimal types is left
// unspecified so that they can potentially be aliases for built-in types.
typedef Decimal_Type32 Decimal32;
typedef Decimal_Type64 Decimal64;
typedef Decimal_Type128 Decimal128;
// The decimal floating-point types are typedefs to the unspecified
// implementation types.
// THE DECIMAL FLOATING-POINT TYPES
// ====================
// class Decimal_Type32
// ====================
class Decimal_Type32 {
// This value-semantic class implements the IEEE-754 32-bit decimal
// floating-point interchange format type. This class is a standard layout
// type that is 'const' thread-safe and exception-neutral.
private:
// DATA
DecimalImpUtil::ValueType32 d_value; // The underlying IEEE representation
public:
// CLASS METHODS
// Aspects
static int maxSupportedBdexVersion();
static int maxSupportedBdexVersion(int versionSelector);
// Return the maximum valid BDEX format version, as indicated by the
// specified 'versionSelector', to be passed to the 'bdexStreamOut'
// method. Note that it is highly recommended that 'versionSelector'
// be formatted as "YYYYMMDD", a date representation. Also note that
// 'versionSelector' should be a *compile*-time-chosen value that
// selects a format version supported by both externalizer and
// unexternalizer. See the 'bslx' package-level documentation for more
// information on BDEX streaming of value-semantic types and
// containers.
// TRAITS
BSLMF_NESTED_TRAIT_DECLARATION(Decimal_Type32, bsl::is_trivially_copyable);
// CREATORS
Decimal_Type32();
// Create a 'Decimal_Type32' object having the value positive zero and
// the smallest exponent value.
Decimal_Type32(DecimalImpUtil::ValueType32 value); // IMPLICIT
// Create a 'Decimal_Type32' object having the specified 'value'.
explicit Decimal_Type32(Decimal_Type64 other);
explicit Decimal_Type32(Decimal_Type128 other);
// Create a 'Decimal_Type32' object having the value closest to the
// value of the specified 'other' following the conversion rules as
// defined by IEEE-754:
//
//: o If 'other' is NaN, initialize this object to a NaN.
//:
//: o Otherwise if 'other' is infinity (positive or negative), then
//: initialize this object to infinity with the same sign.
//:
//: o Otherwise if 'other' has a zero value, then initialize this
//: object to zero with the same sign.
//:
//: o Otherwise if 'other' has an absolute value that is larger than
//: 'std::numeric_limits<Decimal32>::max()' then store the value of
//: the macro 'ERANGE' into 'errno' and initialize this object to
//: infinity with the same sign as 'other'.
//:
//: o Otherwise if 'other' has an absolute value that is smaller than
//: 'std::numeric_limits<Decimal32>::min()' then store the value of
//: the macro 'ERANGE' into 'errno' and initialize this object to
//: zero with the same sign as 'other'.
//:
//: o Otherwise if 'other' has a value that has more significant digits
//: than 'std::numeric_limits<Decimal32>::digits' then initialize this
//: object to the value of 'other' rounded according to the rounding
//: direction.
//:
//: o Otherwise initialize this object to the value of the 'other'.
explicit Decimal_Type32(float other);
explicit Decimal_Type32(double other);
// Create a 'Decimal_Type32' object having the value closest to the
// value of the specified 'other'. *Warning:* clients requiring a
// conversion for an exact decimal value should use
// 'bdldfp_decimalconvertutil' (see {*WARNING*: Conversions from
// 'float' and 'double'}). This conversion follows the conversion
// rules as defined by IEEE-754:
//
//: o If 'other' is NaN, initialize this object to a NaN.
//:
//: o Otherwise if 'other' is infinity (positive or negative), then
//: initialize this object to infinity with the same sign.
//:
//: o Otherwise if 'other' has a zero value, then initialize this
//: object to zero with the same sign.
//:
//: o Otherwise if 'other' has an absolute value that is larger than
//: 'std::numeric_limits<Decimal32>::max()' then store the value of
//: the macro 'ERANGE' into 'errno' and initialize this object to
//: infinity with the same sign as 'other'.
//:
//: o Otherwise if 'other' has an absolute value that is smaller than
//: 'std::numeric_limits<Decimal32>::min()' then store the value of
//: the macro 'ERANGE' into 'errno' and initialize this object to
//: zero with the same sign as 'other'.
//:
//: o Otherwise if 'other' has a value that has more significant digits
//: than 'std::numeric_limits<Decimal32>::digits' then initialize this
//: object to the value of 'other' rounded according to the rounding
//: direction.
//:
//: o Otherwise initialize this object to the value of the 'other'.
explicit Decimal_Type32(int other);
explicit Decimal_Type32(unsigned int other);
explicit Decimal_Type32(long int other);
explicit Decimal_Type32(unsigned long int other);
explicit Decimal_Type32(long long other);
explicit Decimal_Type32(unsigned long long other);
// Create a 'Decimal_Type32' object having the value closest to the
// value of the specified 'other' following the conversion rules as
// defined by IEEE-754:
//
//: o If 'other' is zero then initialize this object to a zero with an
//: unspecified sign and an unspecified exponent.
//:
//: o Otherwise if 'other' has a value that is not exactly
//: representable using 'std::numeric_limits<Decimal32>::digits'
//: decimal digits then initialize this object to the value of
//: 'other' rounded according to the rounding direction.
//:
//: o Otherwise initialize this object to the value of 'other' with
//: exponent 0.
//! Decimal_Type32(const Decimal_Type32& original) = default;
// Create a 'Decimal_Type32' object that is a copy of the specified
// 'original' as defined by the 'copy' operation of IEEE-754 2008:
//
//: o If 'original' is NaN, initialize this object to a NaN.
//:
//: o Otherwise initialize this object to the value of 'original'.
//
// Note that since floating-point types may be NaN, and NaNs are
// unordered (do not compare equal even to themselves) it is possible
// that a copy of a decimal will not compare equal to the original;
// however it will behave as the original.
//! ~Decimal_Type32() = default;
// Destroy this object.
// MANIPULATORS
//! Decimal_Type32& operator=(const Decimal_Type32& rhs) = default;
// Make this object a copy of the specified 'rhs' as defined by the
// 'copy' operation of IEEE-754 2008 and return a reference providing
// modifiable access to this object:
//
//: o If 'rhs' is NaN, set this object to a NaN.
//:
//: o Otherwise set this object to the value of 'rhs'.
//
// Note that since floating-point types may be NaN, and NaNs are
// unordered (do not compare equal even to themselves) it is possible
// that, after an assignment, a decimal will not compare equal to the
// original; however it will behave as the original.
Decimal_Type32& operator++();
// Add 1.0 to the value of this object and return a reference to it.
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may just set it to 1.0 (if the original value is small).
Decimal_Type32& operator--();
// Add -1.0 to the value of this object and return a reference to it.
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may just set it to -1.0 (if the original value is small).
Decimal_Type32& operator+=(Decimal32 rhs);
Decimal_Type32& operator+=(Decimal64 rhs);
Decimal_Type32& operator+=(Decimal128 rhs);
// Add the value of the specified 'rhs' object to the value of this
// object as described by IEEE-754, store the result in this object,
// and return a reference to this object.
//
//: o If either of this object or 'rhs' is signaling NaN, then store
//: the value of the macro 'EDOM' into 'errno' and set this object to
//: a NaN.
//:
//: o Otherwise if either of this object or 'rhs' is NaN then set this
//: object to a NaN.
//:
//: o Otherwise if this object and 'rhs' have infinite values of
//: differing signs, store the value of the macro 'EDOM' into 'errno'
//: and set this object to a NaN.
//:
//: o Otherwise if this object and 'rhs' have infinite values of the
//: same sign, then do not change this object.
//:
//: o Otherwise if 'rhs' has a zero value (positive or negative), do
//: not change this object.
//:
//: o Otherwise if the sum of this object and 'rhs' has an absolute
//: value that is larger than 'std::numeric_limits<Decimal32>::max()'
//: then store the value of the macro 'ERANGE' into 'errno' and
//: set this object to infinity with the same sign as that result.
//:
//: o Otherwise set this object to the sum of the number represented by
//: 'rhs' and the number represented by this object.
//
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may seem to update it to the value of 'rhs' (if the original value is
// small).
//
// Also note that when 'rhs' is a 'Decimal64', this operation is
// always performed with 64-bit precision to prevent loss of
// precision of the 'rhs' operand (prior to the operation). The
// result is then rounded back to 32 bits and stored in this object.
// See IEEE-754 2008, 5.1, first paragraph, second sentence for
// specification.
//
// Also note that when 'rhs' is a 'Decimal128', this operation is
// always performed with 128-bit precision to prevent loss of
// precision of the 'rhs' operand (prior to the operation). The
// result is then rounded back to 32 bits and stored in this object.
// See IEEE-754 2008, 5.1, first paragraph, second sentence for
// specification.
Decimal_Type32& operator+=(int rhs);
Decimal_Type32& operator+=(unsigned int rhs);
Decimal_Type32& operator+=(long rhs);
Decimal_Type32& operator+=(unsigned long rhs);
Decimal_Type32& operator+=(long long rhs);
Decimal_Type32& operator+=(unsigned long long rhs);
// Add the specified 'rhs' to the value of this object as described by
// IEEE-754, store the result in this object, and return a reference to
// this object.
//
//: o If this object is signaling NaN, then store the value of the
//: macro 'EDOM' into 'errno' and set this object to a NaN.
//:
//: o Otherwise if this object is NaN, then do not change this object.
//:
//: o Otherwise if this object is infinity, then do not change it.
//:
//: o Otherwise if the sum of this object and 'rhs' has an absolute
//: value that is larger than 'std::numeric_limits<Decimal32>::max()'
//: then store the value of the macro 'ERANGE' into 'errno' and
//: set this object to infinity with the same sign as that result.
//:
//: o Otherwise set this object to the sum of 'rhs' and the number
//: represented by this object.
//
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may seem to update it to the value of 'rhs' (if the original value is
// small).
//
// Also note that this operation is always performed with 64-bit
// precision to prevent loss of precision of the 'rhs' operand (prior
// to the operation). The result is then rounded back to 32 bits and
// stored in this object. See IEEE-754 2008, 5.1, first paragraph,
// second sentence for specification.
Decimal_Type32& operator-=(Decimal32 rhs);
Decimal_Type32& operator-=(Decimal64 rhs);
Decimal_Type32& operator-=(Decimal128 rhs);
// Subtract the value of the specified 'rhs' from the value of this
// object as described by IEEE-754, store the result in this object,
// and return a reference to this object.
//
//: o If either of this object or 'rhs' is signaling NaN, then store
//: the value of the macro 'EDOM' into 'errno' and set this object to
//: a NaN.
//:
//: o Otherwise if either of this object or 'rhs' is NaN then set this
//: object to a NaN.
//:
//: o Otherwise if this object and 'rhs' have infinite values of the
//: same sign, store the value of the macro 'EDOM' into 'errno'
//: and set this object to a NaN.
//:
//: o Otherwise if this object and the 'rhs' have infinite values of
//: differing signs, then do not change this object.
//:
//: o Otherwise if the 'rhs' has a zero value (positive or negative),
//: do not change this object.
//:
//: o Otherwise if subtracting the value of the 'rhs' object from this
//: object results in an absolute value that is larger than
//: 'std::numeric_limits<Decimal32>::max()' then store the value of
//: the macro 'ERANGE' into 'errno' and set this object to infinity
//: with the same sign as that result.
//:
//: o Otherwise set this object to the result of subtracting the value
//: of 'rhs' from the value of this object.
//
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may seem to update it to the negated value of 'rhs' (if the original
// value is small).
//
// Also note that when 'rhs' is a 'Decimal64', this operation is
// always performed with 64-bit precision to prevent loss of
// precision of the 'rhs' operand (prior to the operation). The
// result is then rounded back to 32 bits and stored in this object.
// See IEEE-754 2008, 5.1, first paragraph, second sentence for
// specification.
//
// Also note that when 'rhs' is a 'Decimal128', this operation is
// always performed with 128-bit precision to prevent loss of
// precision of the 'rhs' operand (prior to the operation). The
// result is then rounded back to 32 bits and stored in this object.
// See IEEE-754 2008, 5.1, first paragraph, second sentence for
// specification.
Decimal_Type32& operator-=(int rhs);
Decimal_Type32& operator-=(unsigned int rhs);
Decimal_Type32& operator-=(long rhs);
Decimal_Type32& operator-=(unsigned long rhs);
Decimal_Type32& operator-=(long long rhs);
Decimal_Type32& operator-=(unsigned long long rhs);
// Subtract the specified 'rhs' from the value of this object as
// described by IEEE-754, store the result in this object, and return a
// reference to this object.
//
//: o If this object is signaling NaN, then store the value of the
//: macro 'EDOM' into 'errno' and set this object to a NaN.
//:
//: o Otherwise if this object is NaN, then do not change this object.
//:
//: o Otherwise if this object is infinity, then do not change it.
//:
//: o Otherwise if subtracting 'rhs' from this object's value results
//: in an absolute value that is larger than
//: 'std::numeric_limits<Decimal32>::max()' then store the value of
//: the macro 'ERANGE' into 'errno' and set this object to infinity
//: with the same sign as that result.
//:
//: o Otherwise set this object to the result of subtracting 'rhs' from
//: the value of this object.
//
// Note that this is a floating-point value so this operation may not
// change the value of this object at all (if the value is large) or it
// may seem to update it to the negated value of 'rhs' (if the original
// value is small).
//
// Also note that this operation is always performed with 64-bit
// precision to prevent loss of precision of the 'rhs' operand (prior
// to the operation). The result is then rounded back to 32 bits and
// stored in this object. See IEEE-754 2008, 5.1, first paragraph,
// second sentence for specification.
Decimal_Type32& operator*=(Decimal32 rhs);
Decimal_Type32& operator*=(Decimal64 rhs);
Decimal_Type32& operator*=(Decimal128 rhs);
// Multiply the value of the specified 'rhs' object by the value of
// this object as described by IEEE-754, store the result in this
// object, and return a reference to this object.
//