README
tab2xml
The goal of tab2xml is to convert spreadsheet files
(.xlsx or .ods) into structured XML documents using a predefined
template. The package processes the spreadsheet data, replacing template
tokens with corresponding values, and manages foreign key relationships
automatically.
Installation
You can install the released version of tab2xml from CRAN with:
install.packages ("tab2xml" )
And the development version from GitHub with:
# install.packages("pak")
pak:: pak ("josesamos/tab2xml" )
Example
This example demonstrates how to use tab2xml to convert
an Excel or ODS file into an XML file, based on a schema example from
the Mondrian
Documentation .
library (tab2xml)
# Define file paths
source_xml <- system.file ("extdata" , "schema_template.xml" , package = "tab2xml" )
source_xlsx <- system.file ("extdata" , "schema.xlsx" , package = "tab2xml" )
temp_file <- tempfile (fileext = ".xml" )
Show spreadsheet contents
sheet_names <- readxl:: excel_sheets (source_xlsx)
for (sheet in sheet_names) {
cat (" \n #### Sheet:" , sheet)
data <- readxl:: read_excel (source_xlsx, sheet = sheet)
print (knitr:: kable (data))
}
Sheet: Cube
Sheet: Table
1
sales_fact_1997
2
customer
3
time_by_day
Sheet: Dimension
1
Gender
customer_id
1
2
Time
time_id
1
Sheet: Hierarchy
1
Gender
true
allMemberName=“All Genders”
customer_id
1
2
2
Time
false
NA
time_id
2
3
Sheet: Level
Gender
gender
String
true
1
Year
the_year
Numeric
true
2
Quarter
quarter
Numeric
false
2
Month
month_of_year
Numeric
false
2
Sheet: Measure
Unit Sales
unit_sales
sum
#,###
1
Store Sales
store_sales
sum
#,###.##
1
Store Cost
store_cost
sum
#,###.00
1
Sheet: CalculatedMember
1
Profit
Measures
[Measures].[Store Sales] -
[Measures].[Store Cost]
1
Sheet:
CalculatedMemberProperty
FORMAT_STRING
$#,##0.00
1
Convert spreadsheet to XML
file <- sheet2xml (source_xlsx, source_xml, temp_file, optimize = TRUE )
Check output
library (xml2)
#> Warning: package 'xml2' was built under R version 4.4.3
xml_content <- readLines (file, warn = FALSE )
cat ("```xml \n " , paste (xml_content, collapse = " \n " ), " \n ```" , sep = "" )
<?xml version= "1.0" encoding= "UTF-8" ?>
<Schema >
<Cube name= "Sales" >
<Table name= "sales_fact_1997" />
<Dimension name= "Gender" foreignKey= "customer_id" >
<Hierarchy name= "Gender" hasAll= "true" allMemberName= "All Genders" primaryKey= "customer_id" >
<Table name= "customer" />
<Level name= "Gender" column= "gender" type= "String" uniqueMembers= "true" />
</Hierarchy >
</Dimension >
<Dimension name= "Time" foreignKey= "time_id" >
<Hierarchy name= "Time" hasAll= "false" primaryKey= "time_id" >
<Table name= "time_by_day" />
<Level name= "Year" column= "the_year" type= "Numeric" uniqueMembers= "true" />
<Level name= "Quarter" column= "quarter" type= "Numeric" uniqueMembers= "false" />
<Level name= "Month" column= "month_of_year" type= "Numeric" uniqueMembers= "false" />
</Hierarchy >
</Dimension >
<Measure name= "Unit Sales" column= "unit_sales" aggregator= "sum" formatString= "#,###" />
<Measure name= "Store Sales" column= "store_sales" aggregator= "sum" formatString= "#,###.##" />
<Measure name= "Store Cost" column= "store_cost" aggregator= "sum" formatString= "#,###.00" />
<CalculatedMember name= "Profit" dimension= "Measures" formula= "[Measures].[Store Sales] - [Measures].[Store Cost]" >
<CalculatedMemberProperty name= "FORMAT_STRING" value= "$#,##0.00" />
</CalculatedMember >
</Cube >
</Schema >
In this way, we can organize and work with the data in tabular form
and generate XML documents directly using the provided templates.